Compact Course:
Introduction to Computational Linguistics
Date: Riga, 28 November - 1 December 2008
Lecturers: Wiebke Petersen, Pawel Sirotkin
Announcement (in Latvian)
Computational Linguistics (CL) is concerned with the computational aspects of human languages. It is an interdisciplinary field between linguistics and computer science. The aim of CL is to provide computational models of various kinds of linguistic phenomena in order to get computers to perform various tasks with human languages, i.e., to use natural language as input, output, or both. Possible topics include parsing, grammar induction, information retrieval, and machine translation. The course will be concerned with both the formal foundations of CL and their practical applications.
The following subjects will be covered (provisional):
- overview of CL
- automatons and grammars
- context-free grammars and parsing
- statistical methods
- part-of-speech tagging
- information retrieval
The course is at beginner level.
Slides
- Friday: General Remarks (Wiebke; 60 KB, PDF)
- all days: Introduction, Formal Languages, Finite State Automatons, Prolog, Context-Free Grammars, Parsing (Wiebke; 2,600 KB, PDF)
- Friday: Introduction to Linguistics (Pawel; 580 KB, PowerPoint)
- Saturday: Finite State Transducers (Wiebke; 315 KB, PDF)
- Saturday: Part-of-Speech Tagging (Pawel; 980 KB, PowerPoint)
- Sunday: Introduction to Perl (Pawel; 880 KB, PDF)
Software
- JFLAP
- Exorciser (unfortunately, available only in German)
- Active Perl
Links
- Question answering:
- Machine translation:
- Part-of-Speech Taggers
  - CST (English, Russian, German...)
  - Cognitive Communication Group (English)
  - CLAWS (English)
- Prolog
  - Learn Prolog Now! (good introduction to Prolog)
- Xerox tools (many tools for text manipulation, FSTs, ...)
Literature
- Daniel Jurafsky & James H. Martin. Speech and Language Processing. Prentice Hall, 2nd edition, 2008.
- Christopher D. Manning & Hinrich Schütze. Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, MA, 1999.
- Barbara H. Partee, Alice ter Meulen & Robert E. Wall. Mathematical Methods in Linguistics. Kluwer Academic Publishers, 1990.