
Ling 795: Analyzing Web Texts (Spring 2006)

by Rob Malouf last modified 2006-11-14 10:38
Web text, such as blogs, newsgroups, message boards, and email lists, can provide an easily collected and incredibly rich source of data on a nearly limitless range of topics. However, the sheer quantity of data makes comprehensive qualitative analysis impossible, and the nature of web texts presents a set of unique challenges for standard computational methods. In this seminar, we will investigate web texts as a distinct text type (or types), looking at the linguistic and extra-linguistic properties that make them unique. We will also explore some of the data-intensive methods that can be used to extract useful information from large, noisy collections of web texts.
Available resources
Semester: Spring 2006
Course type: Seminar
Instructor: Rob Malouf
Time: T 19:00–21:40
Location: AH 2132

Requirements

The goals of this course are for us to gain experience in:

  • exploring the state of the art of linguistically motivated techniques for analyzing web texts,
  • reading and evaluating the primary literature,
  • presenting and discussing research material with peers,
  • identifying open research questions,
  • and designing and carrying out our own experiments.

Throughout the term, participants (including auditors!) will present and discuss articles from the reading list, which cover a number of aspects of text and web mining.

In addition to leading and participating in discussions, students taking the class for a grade will also prepare a final project. Projects should somehow involve web texts and NLP, but need not be restricted to the methods we cover in class. Ideally, the final project should be something that could be submitted to one of the many computational linguistics conferences.

The final grade will be based on class participation and on a project that applies text mining technology to a useful and interesting problem:

Component                    Due        Weight
Project proposal (<1 page)   Feb 28     10%
Annotated bibliography       March 21   10%
Data set                     April 4    10%
Final project                May 18     50%
Class participation                     20%

Working in groups (of 2 or 3) is strongly encouraged!

