Data Science 1 with Python (303-1-21)

Instructors

Lizhen Shi

Meeting Info

Swift Hall 107: Mon, Wed 5:30PM - 6:50PM
Location of Midterm TBD: Fri 7:00PM - 9:00PM

Overview of class

Only Statistics majors, Data Science majors, Data Science minors, and Applied Statistics Masters students assigned to take 303-1 in this quarter are able to register for this course.

First course in Data Science, with focus on data management, manipulation, and visualization skills and techniques for exploratory data analysis. The course also introduces the Python programming language in the context of Data Science. Students may not receive credit for both this course and STAT 301-1.

Registration Requirements

Prerequisites: STAT 201 or equivalent and STAT 202-0 or STAT 210 or consent of the instructor.

Learning Objectives

At the end of the course, students should be able to:
1. Translate a problem described in layman's terms into a data science project.
2. Acquire, integrate, and store data from various sources.
3. Manipulate, clean, and transform data to make it suitable for answering the question at hand.
4. Visualize, explore, and analyze data to identify patterns and gather insights.
5. Demonstrate proficiency with coding in the Python programming language, in the context of data science.
6. Collaborate in a team to develop a complete data science solution that answers a question of interest.

Teaching Method

Classes will consist of interactive lectures. Students are expected to read the book and come to class. The instructor will build on the content of the book and discuss more complicated examples. Students will be asked questions during class, so engagement is necessary. Everyone must bring their own laptop to each class, as coding in Python will be required. Python must be installed on the laptop.

Evaluation Method

Students will be assessed on the learning objectives with:
1. Weekly Assignments: Students will have weekly assignments to practice and demonstrate the coding techniques, tools, and methods taught during class hours. These assignments will test students on learning objectives 2, 3, 4, and 5.
2. Mid-term exam: Students will have a mid-term exam, where they will be provided with a dataset to answer a set of questions. This assessment will test students on learning objectives 1, 2, 3, and 5.
3. Final exam: Students will have a final exam, where they will be provided with multiple datasets to answer a few broad questions. This assessment will test students on learning objectives 1, 2, 3, 4, and 5.
4. Course project: Students will have the freedom to identify a problem of their choice, and leverage data to solve it. This assessment will test students on all the learning objectives.
5. Class participation: Students will have the opportunity to earn bonus class participation points by answering questions in class or on the online class forum.

Class Materials (Required)

Krishna, A., Shi, L., Besler, E., and Kuyper, A. (2022). Introduction to Data Science with Python. https://nustat.github.io/DataScience_Intro_python/

McKinney, W. (2017). Python for data analysis: Data wrangling with Pandas, NumPy, and IPython. 2nd Edition. O'Reilly Media, Inc. ISBN-13: 978-1491957660; ISBN-10: 1491957662

Class Materials (Suggested)

Reference book: VanderPlas, J. (2016). Python data science handbook: Essential tools for working with data. O'Reilly Media, Inc. ISBN-13: 978-1491912058; ISBN-10: 1491912057

Sample online material for Python and libraries:
Python for Beginners: https:/

Class Attributes

Formal Studies Distro Area

Enrollment Requirements

Enrollment Requirements: Prerequisite: STAT 201-0 or COMP_SCI 110-0 and STAT 202-0 or STAT 210-0 or STAT 232-0 or PSYCH 201-0 or IEMS 201-0 or IEMS 303-0 or equivalent.
Add Consent: Department Consent Required