Deep Learning in NLP (Winter 2019/2020)
General Information
Instructors: | Christian Wurm, Jakub Waszczuk |
Theoretical sessions: | Monday, 14:30 – 16:00, room 24.21.05.61 |
Practical sessions: | Tuesday, 14:30 – 16:00, room 24.21.03.62-64 |
Course web page: | https://user.phil.hhu.de/~waszczuk/teaching/hhu-dl-wi19/ (this web page, which will be updated throughout the course) |
Office hours: | by appointment |
Languages: | German and English |
Course Description
The aim of this course is to understand state-of-the-art neural network techniques and to apply them in practice, in particular to natural language processing problems.
Monday sessions will typically be dedicated to theory, Tuesday sessions to programming. During the practical sessions, we will mostly use the PyTorch framework to implement our networks. Instructions on how to install all the necessary tools on Ubuntu are here.
Script
The theoretical content can be found in the script (caution, frequent updates!). Last updated: January 20, 2020.
Requirements
- BN: Complete the theoretical and the programming homework exercises. The homework exercises will be published on this web page as we go.
- AP: Term paper: 4-5 pages for undergraduate students, 7-10 pages for master students. Please use the ACL 2020 stylesheet (LaTeX, Word). You can pick a topic of your choice; it does not necessarily have to be NLP-related. Documented code and running/installation instructions are part of the deliverables.
- UPDATE 14.02: The code should be documented in the standard way: provide docstrings/comments which explain what the code does and why, especially for more complicated chunks.
Schedule
Preliminary schedule of the theoretical and practical sessions:
8 Oct | Introduction and Overview |
15 Oct | Python refresher: Recall how to program in Python and get familiar with the development environment (VSCode, IPython). Python refresher and some hints on using VSCode and IPython. Homework exercise (updated on 15/10), the partial solution (unpack it first), and the dataset with person names. General feedback and solution to the first homework. |
21 Oct | Vectors and matrices |
22 Oct | Basic end-to-end example: Apply a neural network to a simple classification task. We will implement a feed-forward network using the basic PyTorch primitives (see the feed-forward sketch after the schedule). Homework and the corresponding code (unpack it first). Additional material on using tensors in PyTorch. General feedback and solution to the second homework. |
28 Oct | Theoretical homework and the corresponding solution. |
29 Oct | Language classification (I): Application design. Tackle the simple NLP task of classifying person names according to their language (English, German, French, …). Implement a couple of higher-level modules/classes on top of the basic primitives provided by the PyTorch framework, which will allow us to build more complex deep learning models (see the module-based classifier sketch after the schedule). Practical session, the corresponding code (original), as well as the version we worked on during the class. (Link to the theoretical homework moved up, see 28 Oct.) |
4 Nov | Linear separability |
5 Nov | Application design continued: We continue working on the practical session. Download the partial solution (zip) as we left it last week. The additional explanations on github may also be helpful. Homework (updated on 11/11), based on the practical session. The solution can be found on github. |
11 Nov | The simple neuron and deep architectures. Theoretical homework and the corresponding solution. |
12 Nov | Stochastic gradient descent: Implement stochastic gradient descent, learn about PyTorch optimizers (e.g. Adam), and use the full train/dev/test split (see the training-loop sketch after the schedule). With SGD, we should be able to train on the entire training set. We have already seen the dev part (dev80.csv + dev20.csv). You should avoid doing any experiments with test.csv yet. |
19 Nov | Batching: Batching is the technique of specifying neural computations over batches (i.e., sets) of dataset elements. It allows for better parallelization and, hence, faster computation (see the batching sketch after the schedule). Homework, the corresponding code, and the solution. Additionally, Ex. 1 solution notes. |
25 Nov | Backpropagation (theory) |
26 Nov | Backpropagation (practical aspects) |
3 Dec | POS tagging (I): Embedding (see the embedding sketch after the schedule). Exercises and the corresponding code (also includes the UD dataset sample). The code (without the dataset) is also on github. UPDATE: version of the code with modifications implemented during the session. |
9 Dec | LSTM (theory) |
10 Dec | POS tagging (II): Scoring and Training. Exercises and the corresponding code (also includes the UD dataset sample). UPDATE: version of the code with modifications implemented during the session. |
17 Dec | POS tagging (III): Training and LSTMs (see the LSTM tagger sketch after the schedule). Exercises and the corresponding code (also includes the UD dataset sample). UPDATE: version of the code with modifications implemented during the session. UPDATE 31.12.2019: optimized version of the code. See also the description of the optimization steps. |
7 Jan | POS tagging (IV): Pre-trained word embeddings + dropout (see the corresponding sketch after the schedule). Finalize the implementation of the POS tagger. Exercises and the corresponding code (also includes the UD dataset sample). Note that the code contains certain optimizations implemented during the break; these speed up training without changing the underlying model (the accuracy should be roughly the same). English fastText word vectors: the 10^5 most frequent words and the words present in the English UD treebank (both files are based on wiki-news-300d-1M-subword.vec.zip). UPDATE: version of the code after the session (with the fastText vectors included). |
14 Jan | Dependency parsing (I): Task description (updated on 21.01.2020) and the corresponding code (includes the UD dataset sample and the fastText vectors). Version of the code after the session (WARNING: this is the last time the code is published on this web page, make sure to keep track of your own code from now on!) |
21 Jan | Dependency parsing (II) |
27 Jan | Project proposal presentations |
28 Jan | Dependency parsing (III): Dependency-aware loss function (click raw at the top of the page to copy and paste the code; see also the head-selection loss sketch after the schedule). |
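For the basic end-to-end example (22 Oct), here is a minimal sketch of a feed-forward classifier built directly from basic PyTorch primitives (only the loss comes from torch.nn). The toy data, layer sizes, and learning rate are made up for illustration and are not taken from the course materials.

```python
import torch

torch.manual_seed(0)

# Toy data (hypothetical): 100 points with 2 features, 2 classes.
x = torch.randn(100, 2)
y = (x[:, 0] + x[:, 1] > 0).long()

# Parameters of a one-hidden-layer network, created by hand.
w1 = torch.randn(2, 10, requires_grad=True)
b1 = torch.zeros(10, requires_grad=True)
w2 = torch.randn(10, 2, requires_grad=True)
b2 = torch.zeros(2, requires_grad=True)

def forward(inp):
    """Feed-forward pass: linear -> sigmoid -> linear."""
    hidden = torch.sigmoid(inp @ w1 + b1)
    return hidden @ w2 + b2

loss_fn = torch.nn.CrossEntropyLoss()
for epoch in range(100):
    loss = loss_fn(forward(x), y)
    loss.backward()                    # compute gradients
    with torch.no_grad():              # plain gradient-descent update
        for p in (w1, b1, w2, b2):
            p -= 0.1 * p.grad
            p.grad.zero_()

accuracy = (forward(x).argmax(dim=1) == y).float().mean()
print(f"training accuracy: {accuracy:.2f}")
```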
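For the application-design sessions (29 Oct / 5 Nov), a sketch of how higher-level classes can be layered on top of the PyTorch primitives. The class names (NameEncoder, LangClassifier), the bag-of-characters encoding, and all sizes are illustrative assumptions, not the course's actual code.

```python
import torch
import torch.nn as nn

class NameEncoder(nn.Module):
    """Encodes a person name (a sequence of character indices) as a single
    vector by averaging character embeddings (a simple bag of characters)."""
    def __init__(self, char_vocab_size: int, emb_size: int):
        super().__init__()
        self.emb = nn.Embedding(char_vocab_size, emb_size)

    def forward(self, char_ids: torch.Tensor) -> torch.Tensor:
        return self.emb(char_ids).mean(dim=0)

class LangClassifier(nn.Module):
    """Scores the encoded name against the set of candidate languages."""
    def __init__(self, char_vocab_size: int, emb_size: int, num_langs: int):
        super().__init__()
        self.encoder = NameEncoder(char_vocab_size, emb_size)
        self.ffn = nn.Sequential(
            nn.Linear(emb_size, 50), nn.ReLU(), nn.Linear(50, num_langs)
        )

    def forward(self, char_ids: torch.Tensor) -> torch.Tensor:
        return self.ffn(self.encoder(char_ids))

# Usage: score one name (as character indices) against 3 hypothetical languages.
model = LangClassifier(char_vocab_size=60, emb_size=20, num_langs=3)
name = torch.tensor([3, 17, 25, 8])
print(model(name))   # unnormalized scores, shape (3,)
```

Splitting the model into small nn.Module classes is what makes it easy to swap parts (e.g. replacing the averaging encoder with an LSTM later) without touching the rest of the code.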
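For the stochastic gradient descent session (12 Nov), a minimal training loop using a PyTorch optimizer. The model, data, and hyperparameters below are placeholders.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
# Alternatively: optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Toy training set (hypothetical): 200 points with 2 features and a binary class.
data = [(torch.randn(2), torch.randint(0, 2, (1,))) for _ in range(200)]

for epoch in range(10):
    # "Stochastic": visit the training examples in a random order.
    for i in torch.randperm(len(data)).tolist():
        x, y = data[i]
        scores = model(x.unsqueeze(0))   # add a batch dimension: (1, 2)
        loss = loss_fn(scores, y)        # y has shape (1,)
        optimizer.zero_grad()            # forget gradients from the previous step
        loss.backward()                  # compute gradients of the loss
        optimizer.step()                 # update the parameters
```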
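For the batching session (19 Nov), a sketch of padding variable-length sequences so that a whole batch can be processed in one tensor operation; the character indices and sizes are made up.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

# Index 0 is reserved for padding.
emb = nn.Embedding(num_embeddings=60, embedding_dim=20, padding_idx=0)

# Three names of different lengths, as character indices (made up).
names = [torch.tensor([3, 17, 25, 8]),
         torch.tensor([5, 9]),
         torch.tensor([12, 4, 7])]

# Stack them into one (batch_size, max_len) tensor.
batch = pad_sequence(names, batch_first=True, padding_value=0)
print(batch.shape)     # torch.Size([3, 4])

# One embedding lookup for the entire batch instead of a Python loop:
# result has shape (batch_size, max_len, emb_size).
vectors = emb(batch)
print(vectors.shape)   # torch.Size([3, 4, 20])
```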
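For the embedding session (3 Dec), a sketch of the embedding step of a POS tagger: word indices are mapped to dense vectors with nn.Embedding. The tiny vocabulary and sentence are stand-ins for the UD data.

```python
import torch
import torch.nn as nn

# Toy word -> index mapping (in the course this would come from the UD sample).
word_to_ix = {"the": 0, "cat": 1, "sleeps": 2}
emb = nn.Embedding(num_embeddings=len(word_to_ix), embedding_dim=5)

sentence = ["the", "cat", "sleeps"]
ids = torch.tensor([word_to_ix[w] for w in sentence])

# One vector per word: shape (sentence_length, embedding_dim).
word_vectors = emb(ids)
print(word_vectors.shape)   # torch.Size([3, 5])
```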
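For the LSTM-based tagging sessions (10/17 Dec), a sketch of scoring POS tags with a bidirectional LSTM over word embeddings; the architecture details and sizes are illustrative assumptions, not the published course code.

```python
import torch
import torch.nn as nn

class Tagger(nn.Module):
    def __init__(self, vocab_size, emb_size, hidden_size, num_tags):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_size)
        # A bidirectional LSTM gives each word a context-sensitive representation.
        self.lstm = nn.LSTM(emb_size, hidden_size,
                            bidirectional=True, batch_first=True)
        self.score = nn.Linear(2 * hidden_size, num_tags)

    def forward(self, word_ids):
        # word_ids: (batch_size, sent_len)
        embedded = self.emb(word_ids)          # (batch, len, emb)
        contextual, _ = self.lstm(embedded)    # (batch, len, 2*hidden)
        return self.score(contextual)          # (batch, len, num_tags)

model = Tagger(vocab_size=100, emb_size=20, hidden_size=30, num_tags=17)
sentence = torch.tensor([[5, 12, 7, 3]])       # one toy sentence, batch of 1
scores = model(sentence)
print(scores.shape)                            # torch.Size([1, 4, 17])

# Training uses cross-entropy between the per-word scores and the gold tags.
gold = torch.tensor([[2, 0, 5, 1]])
loss = nn.CrossEntropyLoss()(scores.reshape(-1, 17), gold.reshape(-1))
```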
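For the pre-trained embeddings + dropout session (7 Jan), a sketch of plugging pre-trained vectors into nn.Embedding.from_pretrained and applying dropout; the tiny random matrix below stands in for the actual fastText vectors read from the .vec file.

```python
import torch
import torch.nn as nn

# Pretend these rows were read from the fastText .vec file (here: 4 words, dim 6).
pretrained = torch.randn(4, 6)

# freeze=True keeps the pre-trained vectors fixed during training.
emb = nn.Embedding.from_pretrained(pretrained, freeze=True)

# Dropout randomly zeroes components during training, a simple regularizer.
dropout = nn.Dropout(p=0.5)

word_ids = torch.tensor([0, 2, 3, 1])
vectors = dropout(emb(word_ids))   # dropout is active in training mode
print(vectors.shape)               # torch.Size([4, 6])

# At evaluation time, switch dropout off by calling .eval() on the enclosing module.
```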
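For the dependency-aware loss (28 Jan), a hedged sketch of one possible formulation: head selection as per-word classification, i.e. cross-entropy over candidate head positions, with position 0 standing for the dummy root. The actual loss function used in the course is in the linked code and may well differ from this illustration; the scores and gold heads below are made up.

```python
import torch
import torch.nn as nn

sent_len = 4
# Hypothetical head scores: one row per word, one column per candidate head
# (0 = root, 1..sent_len = the other words). Normally produced by the parser.
head_scores = torch.randn(sent_len, sent_len + 1)

# Gold head position of each word (made-up tree).
gold_heads = torch.tensor([2, 2, 0, 2])

loss = nn.CrossEntropyLoss()(head_scores, gold_heads)
print(loss)
```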
Some topics we may consider later on:
- Self-attention
- Structured prediction
- Regularization (dropout)
- "Recursive" (tree-structured) networks
- Language modeling with neural networks
- Unsupervised learning of word embeddings
- Multi-task learning
- Neural machine translation