The Comm Spot

It's All About Communication

Data Cleaning and Preparation Exercise – Data Visualization Assignment


Before any effective data visualization can be created, the underlying data must be clean, organized, and structured correctly. Raw datasets often contain inconsistencies, missing values, formatting issues, or unnecessary variables that make accurate visualization difficult. If these problems are not addressed, even well-designed charts can become misleading or confusing.

This Data Cleaning and Preparation Exercise helps undergraduate students develop a critical foundational skill in data visualization: preparing datasets for analysis and visual communication. Instead of immediately creating charts, students will begin by examining a messy or incomplete dataset and systematically improving its quality.

By completing this assignment, students will learn how data cleaning decisions influence the accuracy, clarity, and reliability of visualizations. They will also gain practical experience preparing datasets so that visualizations can communicate information clearly and ethically.


Why This Data Visualization Assignment Matters

Many students assume that data visualization begins when charts are created. In practice, however, most data analysis and visualization projects begin with data preparation. Datasets collected from surveys, government records, web scraping, or organizational databases often include issues such as:

  • Missing or incomplete values
  • Inconsistent formatting
  • Duplicate entries
  • Misaligned columns
  • Incorrect data types
  • Irrelevant or redundant variables

These issues can lead to inaccurate visualizations or misleading interpretations if they are not corrected.

Professional analysts and researchers often spend a significant portion of their time cleaning and preparing data before building visualizations. Developing this skill early helps students understand that good visualization depends on reliable underlying data.

This assignment encourages students to approach datasets critically, identify problems, and prepare data for accurate visual communication.


Learning Outcomes

By completing this assignment, students will be able to:

  • Identify common data quality problems in raw datasets
  • Organize and structure data for visualization
  • Correct formatting inconsistencies
  • Remove or consolidate duplicate information
  • Identify missing values and determine how to handle them
  • Prepare datasets that can be used for accurate charts and analysis
  • Explain the reasoning behind data preparation decisions

Assignment Overview

In this project, students will receive or locate a dataset that contains formatting inconsistencies or other common data issues. Their task is to examine the dataset, identify problems that could affect visualization, and produce a cleaned and organized version of the data.

Students will then explain the steps they took to improve the dataset and how those changes support clearer visualization.

The assignment emphasizes:

  • Data quality awareness
  • Analytical decision-making
  • Ethical data preparation
  • Transparency in data handling

This assignment works well in:

  • Introductory data visualization courses
  • Communication and journalism classes
  • Business analytics courses
  • Research methods courses
  • Technical writing courses
  • Information design courses

Students may use common data preparation and visualization tools such as:

  • Excel
  • Google Sheets
  • Tableau Prep
  • Power BI
  • OpenRefine
  • R or Python

The goal is not advanced programming but careful examination and preparation of the data.


Deliverables

Students will submit:

  • The original dataset used in the assignment
  • A cleaned and organized version of the dataset
  • A written explanation of the cleaning steps performed
  • A professionally formatted submission file containing both datasets and the written analysis

The cleaned dataset should demonstrate:

  • Consistent formatting
  • Clearly labeled variables
  • Removal or correction of duplicate entries
  • Logical organization of rows and columns
  • Preparation for future visualization


Step-by-Step Instructions for Students

Step One: Examine the Raw Dataset

Begin by carefully reviewing the dataset provided by your instructor or selected from a public data source.

Look for common data issues such as:

  • Missing values
  • Duplicate entries
  • Inconsistent date formats
  • Mixed data types within columns
  • Columns with unclear labels
  • Rows that contain irrelevant information

Spend time exploring the dataset before making changes.

Write a short planning paragraph describing what you observe and what problems might affect visualization.
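
For students working in Python, a first pass over the data might look like the sketch below. It uses only the standard library and counts blank, numeric, and text cells in each column, plus exact duplicate rows. The sample data and the column names (city, visit_date, sales) are invented for illustration and are not part of the assignment.

```python
import csv
import io
from collections import Counter

# Hypothetical messy data, invented for illustration only.
RAW_CSV = """city,visit_date,sales
Provo,2024-01-05,1200
provo,01/06/2024,
Ogden,2024-01-07,n/a
Provo,2024-01-05,1200
"""

def profile(rows):
    """Return per-column counts of blank, numeric, and text cells."""
    report = {}
    for row in rows:
        for column, value in row.items():
            stats = report.setdefault(column, Counter())
            text = (value or "").strip()
            if text == "":
                stats["blank"] += 1
            else:
                try:
                    float(text)
                    stats["numeric"] += 1
                except ValueError:
                    stats["text"] += 1
    return report

rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
report = profile(rows)

# Rows that match another row in every column count as exact duplicates.
duplicate_rows = len(rows) - len({tuple(row.items()) for row in rows})

for column, stats in report.items():
    print(column, dict(stats))
print("exact duplicate rows:", duplicate_rows)
```

A column that reports a mix of numeric and text cells, such as sales here, is a good candidate to mention in the planning paragraph.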


Step Two: Identify Data Quality Issues

Create a list of the specific problems you observe in the dataset.

Examples may include:

  • Empty cells where data should exist
  • Duplicate rows representing the same observation
  • Text entries where numerical values are expected
  • Inconsistent capitalization or spelling of categories
  • Multiple variables combined in a single column

Understanding these issues will help you decide how to clean and organize the data.
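
Inconsistent capitalization or spacing in categories can be hard to spot by eye. The following is a minimal Python sketch, assuming the categories are available as a list of strings, that groups values differing only by case or surrounding spaces. The function name and the sample city names are hypothetical.

```python
from collections import defaultdict

def find_category_variants(values):
    """Group values that differ only by letter case or surrounding spaces."""
    groups = defaultdict(set)
    for value in values:
        groups[value.strip().lower()].add(value)
    # Report only the categories that were written in more than one way.
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }

# Hypothetical category column, invented for illustration.
cities = ["Provo", "provo", "Ogden", "PROVO ", "Ogden"]
variants = find_category_variants(cities)
print(variants)
```

Each entry in the result can go straight into the list of problems for this step. Spelling differences (for example, a typo in a city name) are not detected by this sketch and still need a manual check.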


Step Three: Correct Formatting Inconsistencies

Next, begin improving the structure of the dataset.

Common formatting improvements include:

  • Converting numbers stored as text into numerical values
  • Standardizing date formats
  • Ensuring consistent capitalization for categories
  • Separating combined values into separate columns
  • Renaming columns to clearly describe the variable they contain

These changes help ensure the dataset can be interpreted and visualized correctly.
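
The formatting improvements listed above can be sketched in Python as follows. This is one possible approach, not a required method. The record, the column names, and the two date layouts are assumptions made for the example; in particular, a date such as 01/06/2024 is assumed to be month/day/year, which should be confirmed against the data source before converting.

```python
from datetime import datetime

# Date layouts assumed to appear in the hypothetical data; extend as needed.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")

def standardize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known layout matches."""
    for layout in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), layout).date().isoformat()
        except ValueError:
            continue
    return None

def to_number(text):
    """Convert text such as '1,200' to a float; return None if not numeric."""
    try:
        return float(text.replace(",", "").strip())
    except ValueError:
        return None

def clean_row(row):
    """Apply the formatting fixes to one hypothetical record."""
    # Split a combined "city, state" value into two separate variables.
    city, _, state = row["location"].partition(",")
    return {
        "city": city.strip().title(),          # consistent capitalization
        "state": state.strip().upper(),
        "visit_date": standardize_date(row["date"]),
        "sales_usd": to_number(row["sales"]),  # descriptive name with units
    }

messy = {"location": "salt lake city, ut", "date": "01/06/2024", "sales": "1,200"}
cleaned = clean_row(messy)
print(cleaned)
```

Values that cannot be converted come back as None rather than being guessed, so they can be reviewed as missing values in the next step.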


Step Four: Address Missing Values

Datasets frequently contain missing or incomplete information.

When you encounter missing values, consider possible responses such as:

  • Leaving the cell blank if the absence of a value is itself meaningful
  • Replacing the value with a clearly labeled placeholder that cannot be mistaken for real data
  • Removing rows that contain insufficient information

Explain the reasoning behind your decisions so that readers understand how the dataset was modified.
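
One way to keep missing-value decisions transparent is to record each one as it is made. The sketch below assumes rows are stored as Python dictionaries and that the analyst has decided which fields are required. The function name, the "Unknown" placeholder, and the sample rows are illustrative choices, not part of the assignment.

```python
def handle_missing(rows, required, placeholder="Unknown"):
    """Drop rows missing a required field, fill other blanks, log each decision."""
    kept, log = [], []
    for index, row in enumerate(rows, start=1):
        blanks = [col for col, value in row.items() if value in ("", None)]
        if any(col in required for col in blanks):
            log.append(f"row {index}: removed (missing required field)")
            continue
        filled = dict(row)  # copy so the original data stays untouched
        for col in blanks:
            filled[col] = placeholder
            log.append(f"row {index}: filled '{col}' with '{placeholder}'")
        kept.append(filled)
    return kept, log

# Hypothetical rows, invented for illustration.
rows = [
    {"city": "Provo", "sales": "1200", "region": "North"},
    {"city": "Ogden", "sales": "950", "region": ""},
    {"city": "Logan", "sales": "", "region": "North"},
]
cleaned, log = handle_missing(rows, required={"sales"})

for entry in log:
    print(entry)
```

The log entries can be pasted into the written explanation so readers can see exactly which rows were changed or removed.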


Step Five: Remove Duplicate or Irrelevant Entries

Duplicate rows or irrelevant records can distort analysis.

Carefully inspect the dataset to identify:

  • Duplicate observations
  • Rows that fall outside the scope of the dataset
  • Variables that are unrelated to the dataset’s purpose

Remove or consolidate these entries so that the dataset accurately represents the information being analyzed.
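
As a rough Python sketch of duplicate removal, the function below keeps the first row for each combination of key columns and returns the dropped rows separately so they can be reviewed and documented. Which columns identify an observation is an assumption the analyst must make; the column names and sample rows here are hypothetical.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each key; return kept rows and dropped rows."""
    seen = set()
    kept, dropped = [], []
    for row in rows:
        # Normalize case and spacing so 'Provo ' and 'provo' share one key.
        key = tuple(str(row[col]).strip().lower() for col in key_columns)
        if key in seen:
            dropped.append(row)
        else:
            seen.add(key)
            kept.append(row)
    return kept, dropped

# Hypothetical rows, invented for illustration.
rows = [
    {"city": "Provo", "visit_date": "2024-01-05", "sales": 1200},
    {"city": "provo ", "visit_date": "2024-01-05", "sales": 1200},
    {"city": "Ogden", "visit_date": "2024-01-05", "sales": 950},
]
kept, dropped = remove_duplicates(rows, key_columns=["city", "visit_date"])
print(len(kept), "kept;", len(dropped), "dropped")
```

Returning the dropped rows instead of silently discarding them makes it easier to check that two similar rows really describe the same observation before removing one.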


Step Six: Organize the Dataset for Visualization

Finally, organize the dataset so it can easily support visualization.

Ensure that:

  • Each row represents a single observation
  • Each column represents a single variable
  • Column names are descriptive and consistent
  • Categories are standardized across the dataset

At the end of this step, the dataset should be ready for use in charts, graphs, or further analysis.
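
A common obstacle to the one-row-per-observation layout is a "wide" table in which a variable such as year is spread across column headers. The following Python sketch, with hypothetical column names and values, reshapes such a table so that each row holds a single observation.

```python
def wide_to_long(rows, id_column, value_columns, variable_name, value_name):
    """Turn one row per id with many value columns into one row per observation."""
    long_rows = []
    for row in rows:
        for column in value_columns:
            long_rows.append({
                id_column: row[id_column],
                variable_name: column,    # the old column header becomes a value
                value_name: row[column],
            })
    return long_rows

# Hypothetical wide table: one column per year.
wide = [
    {"city": "Provo", "2023": 1100, "2024": 1200},
    {"city": "Ogden", "2023": 900, "2024": 950},
]
long_rows = wide_to_long(wide, "city", ["2023", "2024"], "year", "sales_usd")

for row in long_rows:
    print(row)
```

In the reshaped data, city, year, and sales_usd are each a single column, which is the layout most charting tools expect.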


Step Seven: Write a Data Preparation Explanation

In the written portion of the assignment, explain:

  • The issues you discovered in the original dataset
  • The steps you took to clean and organize the data
  • Why those changes were necessary
  • How the cleaned dataset supports accurate visualization

Your explanation should demonstrate careful reasoning and transparency in your data preparation process.


Assessment Criteria

This data visualization assignment will be evaluated based on the following criteria:

Identification of Data Issues

  • Clear recognition of formatting problems
  • Accurate identification of inconsistencies or missing values

Data Cleaning Quality

  • Logical and effective corrections to the dataset
  • Consistent formatting and organization
  • Removal of duplicates or irrelevant entries

Analytical Explanation

  • Clear explanation of the cleaning process
  • Thoughtful reasoning behind data preparation decisions
  • Awareness of how preparation affects visualization

Professional Presentation

  • Organized layout of datasets and explanation
  • Clear documentation of changes made
  • Polished and readable writing

Strong submissions demonstrate careful data examination and thoughtful preparation for visualization.


Common Student Mistakes to Avoid

Students frequently make the following mistakes during data cleaning:

  • Making changes without documenting them
  • Removing data without explaining why
  • Leaving inconsistent formatting unresolved
  • Overlooking duplicate records
  • Failing to rename unclear variables

Remember that transparent and well-documented data preparation is essential for trustworthy analysis.


Related Assignments

Continue developing your data visualization skills with these related projects:

  • Chart Type Comparison Project
  • Bar Chart Design Basics
  • Line Graph for Trends Analysis
  • Pie Chart Redesign Challenge
  • Choosing the Right Chart Assignment
  • Axis and Scale Integrity Audit

These assignments build on your ability to prepare, analyze, and visualize data effectively.


*Content on this page was curated and edited by expert humans with the creative assistance of AI.

©2025 | The Comm Spot | By Newbold Communication & Design