Seminar (Laura Kallmeyer)
Monday 10.30-12.00 and Tuesday 10.30-12.00.
Start: 20.04.2020. Last session: 14.07.2020.
Due to the current coronavirus situation, the seminar will be taught online. More information can be found on the Moodle course page.
Parsing is a central task in natural language processing. Its goal is to compute the syntactic structures of sentences. Such a syntactic structure can be either a constituency structure or a dependency structure. The former is in many cases taken to be generated by a context-free grammar (CFG). Consequently, constituency parsing amounts to a) implementing or inducing a context-free grammar and b) using this grammar for parsing. Dependency parsing, in contrast, is mostly grammar-less and relies on machine-learning techniques.
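To make the two kinds of structure concrete, here is a minimal sketch in Python. The toy grammar, the example sentence, the relation labels and the helper names (`TOY_CFG`, `tree_yield`, `licensed_by`) are illustrative assumptions and are not taken from the course material.

```python
# Hypothetical toy CFG: each nonterminal maps to its possible right-hand sides.
TOY_CFG = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V"]],
    "Det": [["the"]],
    "N": [["dog"]],
    "V": [["barks"]],
}

# Constituency structure: nested (label, child, ...) tuples; leaves are words.
CONSTITUENCY_TREE = (
    "S",
    ("NP", ("Det", "the"), ("N", "dog")),
    ("VP", ("V", "barks")),
)

# Dependency structure: (head, dependent, relation) arcs between words.
# The relation names are assumed labels chosen for illustration.
DEPENDENCY_ARCS = [
    ("ROOT", "barks", "root"),
    ("barks", "dog", "nsubj"),
    ("dog", "the", "det"),
]


def tree_yield(tree):
    """Return the words at the leaves of a constituency tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(tree_yield(child))
    return words


def licensed_by(tree, grammar):
    """Check that every local tree in `tree` matches a production of `grammar`."""
    if isinstance(tree, str):
        return True
    label, children = tree[0], tree[1:]
    rhs = [c if isinstance(c, str) else c[0] for c in children]
    return rhs in grammar.get(label, []) and all(
        licensed_by(c, grammar) for c in children
    )
```

The constituency tree groups words into labelled phrases that the grammar must license, whereas the dependency arcs relate words to each other directly and need no grammar.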
In this course, we will mainly concentrate on step b) of CFG-based constituency parsing. We will review various symbolic parsing algorithms that, given a CFG and an input sentence, yield the set of all parse trees for this sentence. In the second half of the course, we will move on to probabilistic parsing, covering Viterbi parsing and weighted deductive parsing with A* estimates.
For references, see the slides of the individual sessions.
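As a rough illustration of what step b) involves, the following sketch uses CYK-style chart parsing to count the parse trees of a sentence. It assumes a grammar in Chomsky normal form; the toy rules, the lexicon and the function name `count_parses` are hypothetical and are not taken from the course slides.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form.
# Binary rules A -> B C, indexed by the right-hand side (B, C).
BINARY = {
    ("NP", "VP"): ["S"],
    ("Det", "N"): ["NP"],
    ("NP", "PP"): ["NP"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],
    ("P", "NP"): ["PP"],
}
# Lexical rules A -> word, indexed by the word.
LEXICAL = {
    "she": ["NP"],
    "sees": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words, start="S"):
    """Count the parse trees rooted in `start` that span the whole input."""
    n = len(words)
    # chart[(i, j)][A] = number of trees with root A over words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for lhs in LEXICAL.get(word, []):
            chart[(i, i + 1)][lhs] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for (left, right), heads in BINARY.items():
                    n_left = chart[(i, k)].get(left, 0)
                    n_right = chart[(k, j)].get(right, 0)
                    if n_left and n_right:
                        for lhs in heads:
                            chart[(i, j)][lhs] += n_left * n_right
    return chart[(0, n)].get(start, 0)


print(count_parses("she sees the man with the telescope".split()))  # prints 2
```

The two analyses correspond to attaching the prepositional phrase either to the verb phrase or to the object noun phrase. Storing backpointers instead of counts would recover the trees themselves, and replacing counts by rule probabilities with a maximum over split points gives the Viterbi variant.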
Schedule and Slides
(Some of the slides are from previous years.)
- 20.-21.04.20 Introduction, Context-free grammars (CFG)
- 27.-28.04.20 CFG continued, Push-Down Automata (PDA)
- 04.-05.05.20 Unger’s Parser.
Example of a trace for Unger’s parser.
An implementation of Unger’s Parser (by Simon Petitjean) can be found here.
Top-down Parsing (LL-Parsing).
- 11.-12.05.20 Parsing as Deduction.
An example of agenda-based parsing can be found here.
- 15.05., 19.05.20 CYK Parsing. Note: the Monday lecture has been moved forward to the preceding Friday (15.05. instead of 18.05.).
- 25.-26.05.20 Shift Reduce Parsing, LL(1) Parsing
- 01.06.20 holiday (Whit Monday, Pfingstmontag)
- 02.06.20 LL(1) Parsing continued
- 08.-09.06.20 Left Corner Parsing
- 15.-16.06.20 Earley Parsing
- 22.-23.06.20 LR Parsing, Tomita
- 29.-30.06.20 PCFG, Inside and outside, Viterbi
- 06.-07.07.20 Treebank grammars, Weighted deductive parsing
- 13.-14.07.20 Weighted deductive parsing continued, A* parsing
- 14.07.20 Transition-based constituency parsing: Kenji Sagae and Alon Lavie, A Classifier-Based Parser with Linear Run-Time Complexity, IWPT 2005.
There are weekly exercises for the course. These exercises are obligatory and must be handed in via Moodle. The solutions will be discussed in class.
- parsing-homework-cfg1 (with solutions), due April 27th 2020.
- parsing-homework-cfg2 (with solution), due May 4th 2020.
- parsing-homework-top-down (with solution), due May 11th 2020.
- parsing-homework-deduction (with solution), due May 18th 2020.
- parsing-homework-cyk (with solution), due May 25th 2020.
- parsing-homework-shift-reduce (with solution), due June 2nd 2020.
- parsing-homework-ll1 (with solution), due June 8th 2020.
- parsing-homework-left-corner-earley (with solution), due June 15th 2020.
- parsing-homework-lr (with solution), due June 22nd 2020.
- parsing-homework-tomita (with solution), due June 29th 2020.
- parsing-homework-treebank-grammars (with solution), due July 6th 2020.
- parsing-homework-weighted-deductive (with solution), due July 13th 2020.
For both a BN and an AP, at least 9 of the 12 homework sheets must be completed with reasonable solutions. Group work is allowed (maximum group size: 3); please put all names on the submission.
For an AP, an individual examination can additionally be arranged, for example an oral exam.
(For students of the BA Computerlinguistik integrativ, Parsing is a foundational course in CL5; no AP is intended here.)