Ling 681: Statistical Methods in Natural Language Processing (Fall 2006)
| Course type | Lecture |
|---|---|
| Instructor | Rob Malouf |
| Time | MWF 13:00–13:50 |
| Location | BA 412 |
Requirements
The final grade will be based on homework assignments (20%), a take-home midterm exam (30%), and a final project (50%).
Throughout the term, there will be occasional homework assignments to practice the techniques learned in class. Working in groups is encouraged, but please include the names of all collaborators on the assignment.
The final project for this course will be a group project to design, implement, document, and evaluate an NLP application based on the statistical methods covered in the course.
Readings
The required textbooks for this course are:
Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. MIT Press. http://nlp.stanford.edu/fsnlp/ and
Ian H. Witten and Eibe Frank. 2005. Data Mining: Practical Machine Learning Tools and Techniques. Second edition. Elsevier. http://www.cs.waikato.ac.nz/~ml/weka/book.html
They are for sale in the campus bookstore and at Amazon, etc. Updates and corrections to the first book can be downloaded from the authors' websites.
Additional readings will be made available in class or via the "Resources" section of the course web page.
Lab
For homework assignments and final projects, we will be using the computational linguistics lab, part of the Social Sciences Research Lab in the basement of the Professional Services and Fine Arts building. Information about how to use the lab will be made available before the first assignment.
Schedule
- Week 1–3 Introduction
  Background · Mathematical background · Probability · Information Theory
- Week 4–6 Statistics
  Descriptive statistics · Hypothesis testing · Corpus statistics
- Week 7–8 Context-free grammars
  Probabilistic context free grammars · Inside-Outside algorithm · Treebank grammars · Dependency-based models
- Week 9–10 Attribute-value grammars
  Unification · Maximum entropy · Parameter estimation · Parsing
- Week 11–12 Machine learning
  Word sense disambiguation · Machine learning algorithms · Evaluation
- Week 13–14 Text classification
  Clustering · Classification · Advanced algorithms · Rainbow · Weka