Data Science 1 with Python (303-1-22)
Instructors
Lizhen Shi
Meeting Info
Harris Hall L07: Tues, Thurs 5:00PM - 6:20PM
Location of Midterm TBD: Fri 7:00PM - 9:00PM
Overview of class
Only Statistics majors, Data Science majors, Data Science minors, and Applied Statistics Masters students assigned to take 303-1 in this quarter are able to register for this course.
First course in Data Science, with focus on data management, manipulation, and visualization skills and techniques for exploratory data analysis. The course also introduces the Python programming language in the context of Data Science. Students may not receive credit for both this course and STAT 301-1.
Registration Requirements
Prerequisites: STAT 201 or equivalent and STAT 202-0 or STAT 210 or consent of the instructor.
Learning Objectives
At the end of the course, students should be able to:
1. Translate a problem described in layman's terms into a data science project.
2. Acquire, integrate, and store data from various sources.
3. Manipulate, clean, and transform data to make it suitable for answering the question at hand.
4. Visualize, explore, and analyze data to identify patterns and gather insights.
5. Demonstrate proficiency with coding in the Python programming language, in the context of data science.
6. Collaborate in a team to develop a complete data science solution that answers a question of interest.
Teaching Method
Classes will be a combination of "lectures" + "lab sessions". The course material will be introduced in the "lectures" portion of the classes. Lectures are expected to be interactive. In the lab sessions, students will be given problem(s) to solve. At least one solution for each problem will be discussed. Students are encouraged to ask questions and collaborate during the session. Everyone must bring their own laptop to each class, as coding in Python will be required. Python must be installed on the laptop.
Evaluation Method
Students will be assessed on the learning objectives with:
1. Weekly Assignments: Students will have weekly assignments to practice and demonstrate the coding techniques, tools, and methods taught during class hours. These assignments will test students on learning objectives 2, 3, 4, and 5.
2. Mid-term exam: Students will have a mid-term exam, where they will be provided with a dataset to answer a set of questions. This assessment will test students on learning objectives 1, 2, 3, and 5.
3. Final exam: Students will have a final exam, where they will be provided with multiple datasets to answer a few broad questions. This assessment will test students on learning objectives 1, 2, 3, 4, and 5.
4. Course project: Students will have the freedom to identify a problem of their choice, and leverage data to solve it. This assessment will test students on all the learning objectives.
5. Class participation: Students will have the opportunity to earn bonus class participation points by answering questions in class or on the online class forum.
Class Materials (Required)
Krishna, A., Shi, L., Besler, E., and Kuyper, A. (2022). Introduction to Data Science with Python. https://nustat.github.io/DataScience_Intro_python/
McKinney, W. (2017). Python for data analysis: Data wrangling with Pandas, NumPy, and IPython. 2nd Edition. O'Reilly Media, Inc. ISBN-13: 978-1491957660, ISBN-10: 1491957662
Class Materials (Suggested)
Reference book: VanderPlas, J. (2016). Python data science handbook: Essential tools for working with data. O'Reilly Media, Inc. ISBN-13: 978-1491912058, ISBN-10: 1491912057
Sample online material for Python and libraries:
Python for Beginners: https:/
Class Attributes
Formal Studies Distro Area
Enrollment Requirements
Enrollment Requirements: Students must have completed STAT 202-0 or STAT 210-0 to enroll in this course.
Add Consent: Department Consent Required