Seminar (Laura Kallmeyer)
Monday 10.30-12.00 and Tuesday 10.30-12.00.
Start: 12.04.2021. Last session: 20.07.2021.
Due to the current Corona situation, the seminar will be taught online. More information can be found on the Moodle course page.
Course description
Parsing is a central task in natural language processing. Its goal is to compute the syntactic structures of sentences. Such a syntactic structure could either be a constituency structure or a dependency structure. The former is in many cases taken to be generated by a context-free grammar (CFG). Consequently, constituency parsing amounts to a) implementing/inducing a context-free grammar and b) using this grammar for parsing. Dependency parsing, in contrast to this, is mostly grammar-less parsing using machine-learning techniques.
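To make step b) concrete, here is a minimal sketch of a CYK recognizer. The toy grammar, the names `BINARY_RULES` and `LEXICAL_RULES`, and the function `cyk_recognize` are assumptions made for illustration and are not taken from the course material. The grammar is assumed to be in Chomsky normal form.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Binary rules map a pair of right-hand-side symbols to possible left-hand sides.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
# Lexical rules map a word to its possible preterminals.
LEXICAL_RULES = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the given word sequence."""
    n = len(words)
    if n == 0:
        return False
    # chart[(i, j)] holds all nonterminals that span words[i:j].
    chart = defaultdict(set)
    for i, word in enumerate(words):
        chart[(i, i + 1)] |= LEXICAL_RULES.get(word, set())
    # Fill the chart bottom-up, from short spans to long spans.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        chart[(i, j)] |= BINARY_RULES.get((left, right), set())
    return start in chart[(0, n)]


print(cyk_recognize("the dog sees the cat".split()))
print(cyk_recognize("dog the sees".split()))
```

This sketch only decides whether a sentence is in the language. A parser would additionally store backpointers in the chart to recover the parse trees.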
In this course, we will mainly concentrate on step b) of CFG-based constituency parsing. We will review various symbolic parsing algorithms that yield, given a CFG and an input sentence, the set of all parse trees for this sentence. In the second half of the course, we will move on to probabilistic parsing, covering Viterbi parsing and weighted deductive parsing with A* estimates.
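As a rough illustration of Viterbi parsing, the following sketch computes the most probable parse tree under a probabilistic CFG using a CYK-style chart. The toy grammar, its probabilities, and the names `BINARY`, `LEXICAL`, and `viterbi_parse` are assumptions chosen for this example, not part of the course material. The grammar is again assumed to be in Chomsky normal form.

```python
# Toy PCFG in Chomsky normal form (hypothetical, for illustration only).
# The probabilities of all rules with the same left-hand side sum to 1.
BINARY = {  # (lhs, left child, right child) -> probability
    ("S", "NP", "VP"): 1.0,
    ("NP", "NP", "PP"): 0.2,
    ("VP", "V", "NP"): 0.7,
    ("VP", "VP", "PP"): 0.3,
    ("PP", "P", "NP"): 1.0,
}
LEXICAL = {  # (lhs, word) -> probability
    ("NP", "she"): 0.3,
    ("NP", "stars"): 0.3,
    ("NP", "telescopes"): 0.2,
    ("V", "sees"): 1.0,
    ("P", "with"): 1.0,
}


def viterbi_parse(words, start="S"):
    """Return (probability, tree) of the best parse, or None if there is none."""
    n = len(words)
    # best[(i, j, A)] = (probability of the best A-subtree over words[i:j], backpointer)
    best = {}
    for i, w in enumerate(words):
        for (lhs, word), p in LEXICAL.items():
            if word == w:
                best[(i, i + 1, lhs)] = (p, w)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (lhs, left, right), p in BINARY.items():
                    if (i, k, left) in best and (k, j, right) in best:
                        prob = p * best[(i, k, left)][0] * best[(k, j, right)][0]
                        # Keep only the most probable analysis per item.
                        if prob > best.get((i, j, lhs), (0.0, None))[0]:
                            best[(i, j, lhs)] = (prob, (k, left, right))
    if (0, n, start) not in best:
        return None

    def build(i, j, sym):
        # Follow backpointers to reconstruct the tree as nested tuples.
        _, back = best[(i, j, sym)]
        if isinstance(back, str):
            return (sym, back)
        k, left, right = back
        return (sym, build(i, k, left), build(k, j, right))

    return best[(0, n, start)][0], build(0, n, start)


result = viterbi_parse("she sees stars with telescopes".split())
print(result)
```

With these assumed probabilities, the example sentence has two analyses (PP attached to the VP or to the NP). The sketch keeps the VP attachment because it has the higher probability.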
For references see the slides of the individual sessions.
Schedule and Slides
(Some of the slides are from previous years.)
- 12.04.21 Introduction
- 13.04.21 Context-free grammars (CFG)
- 19.04.21 CFG continued
- 20.04.21 Push-Down Automata (PDA)
- 26.04.21 Unger’s Parser.
  Example of a trace for Unger’s parser.
  An implementation of Unger’s Parser (by Simon Petitjean) can be found here.
- 27.04.21 Top-down Parsing (LL-Parsing).
- 03.05.21 Parsing as Deduction.
  An example of agenda-based parsing can be found here.
- 04.05.21 Parsing as Deduction continued.
- 10.05.21 CYK Parsing
- 11.05.21 CYK continued
- 17.05.21 Shift-reduce parsing, LL(1) Parsing
- 18.05.21 LL(1) Parsing continued
- 24.05.21 No session (Pentecost)
- 25.05.21 Left Corner Parsing
- 31.05.21 Left Corner Parsing continued
- 01.06.21 Earley Parsing
- 07.06.21 LR Parsing
- 08.06.21 LR continued
- 14.06.21 Tomita
- 15.06.21 PCFG, Inside and outside, Viterbi
- 21.06.21 Treebank grammars
- 22.06.21 Treebank grammars continued
- 28.06.21 Weighted deductive parsing
- 29.06.21 Weighted deductive parsing continued. See here for an exercise from last year that extends the Earley deduction rules with weights.
- 05.07.21 A* parsing
- 06.07.21 Substitute: Martin Zakrezewski
- 12.07.21 A* parsing continued. Example for k-best parsing. And a second example for k-best parsing.
- 13.07.21 Transition-based neural constituency parsing. Literature: Sagae and Lavie, A Classifier-Based Parser with Linear Run-Time Complexity; Dyer et al., Recurrent Neural Network Grammars; Liu and Zhang, In-Order Transition-based Constituent Parsing
- 19.07.21 Transition-based parsing continued. Hybrid session in lecture hall 25.31.HS 5J (24 seats under Corona restrictions). Details in Moodle.
- 20.07.21 No session.
Homework
There are weekly exercises for the course. These exercises are obligatory and have to be handed in via Moodle. The solutions of the exercises will be discussed in the course.
- Homework CFG, due 26.04.21 (with solution)
- Homework Top-Down Parsing, due 03.05.21 (with solution)
- Homework Parsing as Deduction, due 10.05.21 (with solution)
- Homework CYK, due 17.05.21 (with solution)
- Homework CYK and Shift-reduce, due Tuesday 25.05.21 (with solution)
- Homework LL(1), due Monday 31.05.21 (with solution)
- Homework Left-Corner, Earley, due Monday 07.06.21 (with solution)
- Homework LR Parsing, due Monday 14.06.21 (with solution)
- Homework Tomita, due Monday 21.06.21 (with solution)
- Homework PCFG, due Monday 28.06.21 (with solution)
- Homework EM and weighted deductive parsing, due Monday 05.07.21 (with solution)
- Homework A* Parsing, due Monday 12.07.21
Course credit
For both a BN and an AP, at least 9 of the 12 homework sheets must be worked on and solved in a meaningful way. Group work is allowed (maximum group size is 3); please put all names on the submission.
For an AP, an additional individual exam can be arranged, for example an oral exam.
(For students of the BA Computerlinguistik integrativ, Parsing is a basic course in CL5; no AP is intended here.)