Data Science 1 with R (301-1-20)
Instructors
Arend Matthew Kuyper
IPR, 2040 Sheridan Road, Evanston
Meeting Info
Frances Searle Building 1441: Mon, Wed 12:30PM - 1:50PM
Overview of class
Only Statistics majors, Data Science minors, Data Science majors, and Applied Statistics Masters students assigned to take 301-1 in this quarter are able to register for this course.
As the initial course in the STAT 301 Data Science series, this course builds foundational analytical skills and knowledge for data science. Although many data science problems require knowledge of multiple programming languages and technologies, this course uses R and RStudio. Students will develop skills to effectively manage, manipulate, and analyze data in R through project/lab-based learning (1-2 projects/labs per week). The skills developed in this course are necessary for STAT 301-2 and STAT 301-3. ATTENDANCE AT THE FIRST CLASS IS MANDATORY.
Registration Requirements
Prerequisites: STAT 201 or equivalent and STAT 202-0 or STAT 210 or consent of the instructor.
Learning Objectives
(1) Students will be able to identify and follow the general steps/stages required to complete a typical data science project.
(2) Students will learn and demonstrate the use of standard data science tools necessary to effectively and efficiently import, transform, visualize, model, and communicate data.
(3) Students will learn and demonstrate the use of programming techniques to further augment standard data science tools.
(4) Students will be able to conduct and properly communicate a thorough exploratory data analysis.
Teaching Method
A typical class will devote about 10-20 minutes to discussion/lecture, with the remaining time devoted to projects/labs on which students work either individually or in groups. Because the majority of class time will be spent on projects/labs designed around the assigned material, students are expected to prepare for each discussion/lecture by reviewing that material (e.g., readings and videos). Students are also expected to collaborate and engage with one another to help each other learn and solve problems.
Evaluation Method
There will be a final project in place of a written exam. We will also evaluate progress throughout the quarter using project/lab-based learning (1-2 projects/labs per week) and other miscellaneous assessment strategies (for example: short discussions, surveys, and proficiency quizzes).
Class Materials (Required)
(1) R for Data Science by Hadley Wickham & Garrett Grolemund -- Free online version: http://r4ds.had.co.nz/
(2) Free statistical software R (https://cran.rstudio.com/)
(3) Free integrated development environment software RStudio (https://www.rstudio.com/). Think of R as the car engine needed to power and run everything while RStudio is the steering wheel/dashboard that we use to run and control the car.
Class Notes
ATTENDANCE AT THE FIRST CLASS IS MANDATORY
Class Attributes
Formal Studies Distro Area
Enrollment Requirements
Prerequisite: STAT 201-0 or COMP_SCI 110-0 and STAT 202-0 or STAT 210-0 or STAT 232-0 or PSYCH 201-0 or IEMS 201-0 or IEMS 303-0 or equivalent.
Add Consent: Department Consent Required