Data Science 1 with Python (303-1-20)
Instructors
Lizhen Shi
Meeting Info
University Hall 122: Mon, Wed 12:30PM - 1:50PM
Overview of class
Only Statistics majors, Data Science majors, Data Science minors, and Applied Statistics Masters students assigned to take 303-1 in this quarter are able to register for this course.
Python has emerged as a powerful tool for data science in recent years, thanks to its rich ecosystem of libraries and easy-to-understand syntax. This quarter, we cover the fundamentals of Python for data science, focusing on essential libraries such as NumPy, Matplotlib, Seaborn, and Pandas. Through hands-on exercises and real-world examples, students will gain proficiency in data manipulation, data visualization, and exploratory data analysis (EDA). The course emphasizes EDA as a way to understand the underlying patterns and relationships within datasets: students will learn how to use statistical measures and visualization tools to uncover insights, identify outliers, and detect patterns in data.
Students may not receive credit for both this course and STAT 301-1.
Registration Requirements
Prerequisites: STAT 201 or equivalent and STAT 202-0 or STAT 210 or consent of the instructor.
Learning Objectives
At the end of the course, students should be able to:
1. Translate a problem described in layman's terms into a data science project.
2. Acquire, integrate, and store data from various sources.
3. Manipulate, clean, and transform data to make it suitable for answering the question at hand.
4. Visualize, explore, and analyze data to identify patterns and gather insights.
5. Demonstrate proficiency with coding in the Python programming language, in the context of data science.
6. Collaborate in a team to develop a complete data science solution that answers a question of interest.
Teaching Method
Classes will be a combination of "lectures" + "quizzes". Concepts will be introduced in the "lectures" portion of each class, and students will be tested on them at the end of the class. The in-class quizzes will help students prepare for the weekly assignment. Students are allowed to discuss and seek help during the in-class quizzes. Everyone must bring their own laptop to each class, with Python installed, as coding in Python will be required.
Evaluation Method
Students will be evaluated through (1) Quizzes, (2) Assignments, (3) Midterm Exam, (4) Final Exam, (5) Final Project, and (6) Participation.
Class Materials (Required)
Krishna, A., Shi, L., Besler, E., and Kuyper, A. (2022). Introduction to Data Science with Python. https://nustat.github.io/DataScience_Intro_python/
McKinney, W. (2017). Python for data analysis: Data wrangling with Pandas, NumPy, and IPython. 2nd Edition. O'Reilly Media, Inc. ISBN-13: 978-1491957660, ISBN-10: 1491957662
Class Materials (Suggested)
Reference book: VanderPlas, J. (2016). Python data science handbook: Essential tools for working with data. O'Reilly Media, Inc. ISBN-13: 978-1491912058, ISBN-10: 1491912057
Sample online material for Python and libraries:
Python for Beginners: https:/
Class Attributes
Formal Studies Distro Area
Enrollment Requirements
Enrollment Requirements: Prerequisite: STAT 201-0 or COMP_SCI 110-0 and STAT 202-0 or STAT 210-0 or STAT 232-0 or PSYCH 201-0 or IEMS 201-0 or IEMS 303-0 or equivalent.
Add Consent: Department Consent Required