Compact Course:
Formal Language Theory and the complexity of natural languages

Date: Riga, 6th and 7th december 2006
Lecturer: Wiebke Petersen
Telephone: 81-15295


The course will be split into two parts (which can be attended separately, although the attendance of the whole course is recommended):
(1) An introduction to automata and Formal Language Theory
(2) The complexity of natural languages

The first part gives an introduction to Formal Language Theory for students with little or no background in formal systems. The topics to be covered include: regular languages and regular expressions; finite state automata; context free grammars and languages; pushdown automata; the Chomsky hierarchy; weak and strong generative capacity; and grammar equivalence.

We will concentrate on those formal languages that can be generated by formal grammars and recognized by automata. The Chomsky hierarchy classifies those languages hierarchically according to the structural complexity of their generating grammars. In 1957 Chomsky raised the question where natural languages (seen as formal languages) have to be placed in the hierarchy. From the viewpoint of computational linguistics this question is central, since its answer determines the choice of appropriate computational formalisms for natural language processing.

The second part of the course gives an historical overview of the dispute about the formal complexity of natural languages, discusses the arguments, and closes with the surprising result that Suisse German is more complex than most analyzed languages. Additionally, a short glimpse of mildly context-sensitive languages and tree adjoining grammars is provided.