Data Visualization (302-0-20)

Instructors

Arend Matthew Kuyper
IPR, 2040 Sheridan Road, Evanston

Meeting Info

Harris Hall L07: Mon, Wed 3:30PM - 4:50PM

Overview of class

Only Statistics majors, Data Science minors, Data Science majors, and Applied Statistics Masters students assigned to take 302 in this quarter are able to register for this course.

Data visualization (dataviz) plays an important role in data exploration, analysis, and communication. Quality data visualizations have the power to help analysts better understand and effectively communicate their work. Students will learn how to construct quality data visualizations using RStudio. The course will address visualizing data of various forms (e.g. panel, spatial, temporal) and from various statistical domains such as inferential, descriptive, and predictive statistics. The course will focus on principles and techniques used to create static visualizations, and will also provide an introduction to interactive visualizations and dashboards. These skills will be developed through project/lab-based learning (1-2 projects/labs per week). A significant proportion of the course will be dedicated to a large-scale project.

Registration Requirements

At least an introductory understanding of statistics is necessary (i.e. STAT 202 or 210).

Learning Objectives

Students will be able to communicate the core concepts of the grammar of graphics that underlie all static statistical graphics. Students will be able to use the grammar of graphics (as implemented in R) to construct static data visualizations tailored to various types of datasets. Students will be able to construct basic interactive data visualizations.

Teaching Method

A typical class will devote about 10-20 minutes to discussion/lecture, with the remaining time devoted to projects/labs where students will work either by themselves or in groups. Students will be expected to prepare adequately for each discussion/lecture by reviewing assigned material (e.g. readings, videos), because the majority of class time will be spent working on projects/labs designed around that material. Students will be expected to collaborate and engage with other students to help each other learn and solve problems.

Evaluation Method

There will be a final project. Progress will also be evaluated throughout the quarter using project/lab-based learning (1-2 projects/labs per week). Other assessment types may also be used at the instructor's discretion.

Class Materials (Required)

(1) Free online textbook, ggplot2: Elegant Graphics for Data Analysis, 3rd Edition: https://ggplot2-book.org/index.html
(2) Free online textbook, Fundamentals of Data Visualization: https://clauswilke.com/dataviz/
(3) Free online textbook, Mastering Shiny: Build Interactive Apps, Reports & Dashboards Powered by R: https://mastering-shiny.org/index.html
(4) Free statistical software R (https://cran.rstudio.com/)
(5) Free integrated development environment software RStudio (https://www.rstudio.com/). Think of R as the car engine that powers and runs everything, while RStudio is the steering wheel/dashboard that we use to run and control the car.

Class Materials (Suggested)

The Grammar of Graphics (Statistics and Computing) by Leland Wilkinson. Springer-Verlag New York, 2005. ISBN: 9780387245447 (Print) 9780387286952 (Online). Northwestern students can access a free PDF version through the library.

Class Notes

ATTENDANCE AT THE FIRST CLASS IS MANDATORY

Class Attributes

Formal Studies Distro Area