Topics in Statistics (359-0-20)
Topic
Large Language Models
Instructors
Lizhen Shi
Meeting Info
Harris Hall L07: Mon, Wed 2:00PM - 3:20PM
Overview of class
This course provides a comprehensive introduction to large language models (LLMs) and the foundational transformer architecture that powers them. Students will explore the core principles, mathematical foundations, and key innovations behind transformers. The course traces the evolution of word embeddings and NLP models, from word2vec to RNN-based models, encoder-decoder architectures with attention, and modern LLM systems such as GPT. While transformers represent a major breakthrough and the current state of the art, they are not the endpoint of this journey. The course gives students a solid foundation for understanding the ongoing evolution of LLMs and prepares them to stay current in this rapidly advancing field.
Registration Requirements
For Undergraduate Students:
- Completion of either the Python sequence or the R sequence (STAT 303-1,2,3 or STAT 301-1,2,3)
- Students coming from the R sequence should be comfortable with Python programming, as the course materials and assignments will primarily use Python.
For Graduate Students:
- A solid understanding of traditional machine learning concepts (e.g., regression, classification, model evaluation).
- Proficiency in Python programming for data analysis and model implementation.
Learning Objectives
By the end of this course, students will be able to:
- Explain the evolution of large language models (LLMs) — from early word embeddings and RNN-based architectures to modern transformer-based systems — and their impact on natural language processing tasks.
- Understand the core principles of the transformer architecture, including self-attention, positional encoding, multi-head attention, and feedforward components.
- Implement and experiment with transformer-based models using PyTorch for applications such as text generation, classification, and fine-tuning (a brief self-attention sketch follows this list).
- Build a strong conceptual foundation for understanding emerging architectures beyond transformers and staying current with advances in the rapidly evolving field of large language models.
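To give a concrete sense of the implementation work involved, here is a minimal sketch of single-head scaled dot-product self-attention in PyTorch, the mechanism named in the objectives above. It is an illustrative sketch, not course material: the function name, tensor shapes, and random projection matrices are assumptions made for this example.

    import math
    import torch
    import torch.nn.functional as F

    def self_attention(x, w_q, w_k, w_v):
        # x: (batch, seq_len, d_model) input embeddings
        # w_q, w_k, w_v: (d_model, d_k) projection matrices (illustrative)
        q = x @ w_q                                   # queries
        k = x @ w_k                                   # keys
        v = x @ w_v                                   # values
        # Scores scaled by sqrt(d_k), as in the original transformer paper
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        weights = F.softmax(scores, dim=-1)           # each row sums to 1
        return weights @ v                            # weighted sum of values

    # Illustrative usage with random tensors
    d_model, d_k = 16, 8
    x = torch.randn(2, 5, d_model)                    # batch of 2, length-5 sequences
    w_q, w_k, w_v = (torch.randn(d_model, d_k) for _ in range(3))
    print(self_attention(x, w_q, w_k, w_v).shape)     # torch.Size([2, 5, 8])

Multi-head attention runs several such projections in parallel and concatenates the results; positional encoding and the feedforward components listed above complete the transformer block.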
Teaching Method
The primary teaching method will be lectures.
Evaluation Method
1) Homework assignments, 2) Final project, 3) Participation
Class Materials (Required)
Course materials will be distributed via Canvas.
Class Attributes
Formal Studies Distro Area