Data Science 2 with Python (303-2-22)
Instructors
Lizhen Shi
Meeting Info
Harris Hall 107: Tues, Thurs 5:00PM - 6:20PM
Location of Midterm TBD: Fri 7:00PM - 9:00PM
Overview of class
Only Statistics majors, Data Science minors, and Statistics Masters students assigned to take 303-2 in this quarter are able to register for this course.
This course introduces supervised machine learning in Python, with a focus on linear and logistic regression. It prepares students for learning advanced machine learning methods.
Registration Requirements
STAT 303-1 or consent of the instructor.
Learning Objectives
At the end of the course, students should be able to:
1. Translate a problem described in layman's terms into a regression problem.
2. Identify the suitability of regression for a given problem.
3. Develop, interpret, and validate regression models.
4. Integrate regression modeling as a component of a larger data science project.
5. Demonstrate proficiency with coding in the Python programming language, in the context of regression.
6. Collaborate in a team to develop a complete regression-based data science solution that answers a question of interest.
Evaluation Method
1. Weekly Assignments: Students will have weekly assignments to practice and demonstrate the coding techniques, tools, and methods taught during class hours. These assignments will test students on learning objectives 2, 3, and 5.
2. Mid-term exam: Students will have a mid-term exam in which they will be provided with a dataset to develop a linear regression model. This assessment will test students on learning objectives 1, 2, 3, and 5.
3. Final exam: Students will have a final exam in which they will be provided with multiple datasets and a problem to develop a data science solution involving regression. This assessment will test students on learning objectives 1, 2, 3, 4, and 5.
4. Course project: Students will have the freedom to identify a problem of their choice and develop a complete data science solution that includes regression. This assessment will test students on all the learning objectives.
5. Class participation: Students will have the opportunity to earn bonus class participation points by answering questions in class or on the online class forum.
Class Materials (Required)
An Introduction to Statistical Learning with Applications in R by James, Witten, Hastie, and Tibshirani (2013), ISBN-13: 978-1461471370 (available for free online), with Python code at https://github.com/JWarmenhoven/ISLR-python
Class Materials (Suggested)
Linear Models with Python by Julian J. Faraway
Python Data Science Handbook by Jake VanderPlas
The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani and Jerome Friedman
Class Attributes
Formal Studies Distro Area
Enrollment Requirements
Prerequisite: STAT 303-1 or consent of the instructor.
Add Consent: Department Consent Required