Data Science 1 with R (301-1-20)

Instructors

Danielle Kaye Sass

Meeting Info

Swift Hall 107: Mon, Wed 12:30PM - 1:50PM

Overview of class

Only Statistics majors, Data Science minors, Data Science majors, and Applied Statistics Masters students assigned to take 301-1 in this quarter are able to register for this course.

As the initial course in the STAT 301 Data Science series, this course aims to build foundational analytical skills and knowledge for data science. While many data science problems require knowledge of multiple coding languages and technologies, this course uses R and RStudio. Students will develop skills to effectively manage, manipulate, and analyze data in R. These skills will be developed through project/lab-based learning (1-2 projects/labs per week) and are necessary for STAT 301-2 and STAT 301-3. ATTENDANCE AT THE FIRST CLASS IS MANDATORY.

Registration Requirements

Prerequisites: STAT 201 or equivalent and STAT 202-0 or STAT 210 or consent of the instructor.

Learning Objectives

(1) Students will be able to identify and follow the general steps/stages required to complete a typical data science project; (2) Students will learn and demonstrate the use of standard data science tools necessary to effectively and efficiently import, transform, visualize, model, and communicate data; (3) Students will learn and demonstrate the use of programming techniques to further augment standard data science tools; and (4) Students will be able to conduct and properly communicate a thorough exploratory data analysis.

Teaching Method

A typical class will devote about 10-20 minutes to discussion/lecture, with the remaining time devoted to projects/labs on which students will work either by themselves or in groups. Students will be expected to prepare adequately for each discussion/lecture by reviewing assigned material (e.g., readings and videos), because the majority of class time will be spent working on projects/labs designed around that material. Students will be expected to collaborate and engage with other students to help each other learn and solve problems. There will be a final project.

Class Materials (Required)

(1) R for Data Science by Hadley Wickham & Garrett Grolemund -- Free online version: http://r4ds.had.co.nz/
(2) Free statistical software R (https://cran.rstudio.com/)
(3) Free integrated development environment software RStudio (https://www.rstudio.com/). Think of R as the car engine needed to power and run everything while RStudio is the steering wheel/dashboard that we use to run and control the car.

Class Attributes

Formal Studies Distro Area

Enrollment Requirements

Enrollment Requirements: Students must have completed STAT 202-0 or STAT 210-0 to enroll in this course.
Add Consent: Department Consent Required