Text Processing for Linguists (331-0-20)

Instructors

Robert Frederick Voigt Jr

Meeting Info

Annenberg Hall G30: Mon, Wed 9:30AM - 10:50AM

Overview of class

This course offers a practical introduction to programming and to the analysis of natural language text as quantitative data. It is aimed at linguists, social scientists, and humanists with little or no programming background. Students will gain hands-on familiarity with Unix command line tools for text processing, basic programming and web scraping in Python, and algorithmic thinking concepts such as abstraction and decomposition. We will study how to clean and organize linguistic datasets and how to apply methods from computational linguistics, including regular expressions, syntactic parsing, and vector representations of meaning. The course will conclude with a final project in which students curate and analyze a new dataset.

Registration Requirements

LING 250, 260, or 270, or permission of the instructor.

Learning Objectives

• Practical familiarity with command line text manipulation in Unix and basic programming in the Python programming language.
• Understanding of basic tasks in computational text processing and familiarity with relevant software packages.
• Introductory understanding of algorithmic thinking and problem decomposition for computational analysis of language-related questions.

Teaching Method

Lecture, discussion, small group activities, peer feedback, and assignments.

Evaluation Method

Assignments, participation, final project proposal and completion, self-evaluation.

Class Materials (Required)

Course materials are free and are distributed through the class website and the Canvas site.

Class Attributes

Formal Studies Distro Area