Advanced Machine Learning for Data Science (362-0-20)

Instructors

Emre Besler

Meeting Info

Frances Searle Building 2107: Tues, Thurs 3:30PM - 4:50PM

Overview of class

The Advanced Machine Learning course aims to give students a thorough understanding of the mathematical theory and implementation of various Machine Learning (ML) and Deep Learning (DL) models. The course will go beyond supervised learning to analyze unsupervised learning, recommendation engines and anomaly detection tasks. It will then move to the concept of big data, motivating the necessity of Deep Neural Networks. Different architectures of neural networks will be covered, implementing software that analyzes data in image and time-series format. The students are expected to have a basic understanding of ML concepts and terminology from the Data Science (STAT 301/303-1-2-3) sequence, on which the course builds a more mathematical foundation. The coding language is Python, which will be used to implement ML/DL projects on a variety of datasets.

Registration Requirements

STAT 303-1,2,3 or STAT 301-1,2,3

Learning Objectives

At the completion of this course, students should be able to:
- Derive and implement hard and soft clustering algorithms from scratch.
- Understand how to reduce the dimensionality of data and when it is necessary to do so.
- Create short, practical Machine Learning software using the scikit-learn library.
- Understand different ways to set up a Machine Learning task.
- Detect anomalies such as product malfunction and credit card fraud.
- Implement a recommendation engine for movies and online retail products.
- Understand Deep Learning theory and how to implement neural networks from scratch.
- Create neural networks with specialized architectures to analyze image and time-series data.
- Use the Keras and TensorFlow libraries to create neural networks.

Teaching Method

Each class will be divided into a 50-minute lecture and 30 minutes of work time on an in-class assignment. Concepts and theory will be introduced in the lecture, after which students will work on their in-class assignments with the help of the TAs and the instructor. Students are encouraged to ask questions and collaborate during the in-class work time. Everyone must bring their own laptop to each class, as coding in Python will be required. Installation of Anaconda Navigator is necessary.

Evaluation Method

In-class assignments (30%)
Homework assignments (50%)
Final paper (20%)

Class Materials (Required)

The course does not have a textbook. The main content will be taught in lectures and uploaded as lecture notes. Useful sections and exercises from various books will be uploaded as supplementary material.

Everyone must bring their own laptop to each class, as coding in Python will be required. Installation of Anaconda Navigator is necessary.

Class Attributes

Formal Studies Distro Area

Enrollment Requirements

Enrollment Requirements: Registration in this course is reserved for Data Science Majors only. Prerequisites: STAT 301-3 or STAT 303-3.
Add Consent: Department Consent Required