Text Processing for Linguists (331-0-20)
Instructors
Robert Frederick Voigt Jr
Meeting Info
Annenberg Hall G30: Mon, Wed 9:30AM - 10:50AM
Overview of class
This course offers a practical introduction to programming and the analysis of natural language text as quantitative data. It is aimed at linguists, social scientists, and humanists with little-to-no programming background. Students will gain hands-on familiarity with Unix command line tools for text processing, basic programming and web scraping in Python, and algorithmic thinking concepts like abstraction and decomposition. We will study how to clean and organize linguistic datasets and how to apply methods from computational linguistics, including regular expressions, syntactic parsing, and vector representations of meaning. The course will conclude with a final project in which students curate and analyze a new dataset.
Registration Requirements
LING 250, 260, or 270, or permission of the instructor.
Learning Objectives
• Practical familiarity with command line text manipulation in Unix and basic programming in Python.
• Understanding of basic tasks in computational text processing and familiarity with relevant software packages.
• Introductory understanding of algorithmic thinking and problem decomposition for computational analysis of language-related questions.
Teaching Method
Lecture, discussion, small group activities, peer feedback, and assignments.
Evaluation Method
Assignments, participation, final project proposal and completion, self-evaluation.
Class Materials (Required)
Course materials are free and are distributed through the class website and Canvas site.
Class Attributes
Formal Studies Distro Area