Deep Learning in NLP (Winter 2019/2020)
|Theoretical sessions:||Monday, 14:30 – 16:00, room 24.21.05.61|
|Practical sessions:||Tuesday, 14:30 – 16:00, room 24.21.03.62-64|
|Course web page:||https://user.phil.hhu.de/~waszczuk/teaching/hhu-dl-wi19/|
|(This web page, which will be updated throughout the course.)|
|Office hours:||by appointment|
|Languages:||German and English|
The aim of this course is to understand state-of-the-art neural network techniques and to apply them in practice, in particular to natural language processing problems.
Monday sessions will typically be dedicated to theory, Tuesday sessions to programming. During the practical sessions, we will mostly use the PyTorch framework to implement our networks. Instructions on how to install all the necessary tools on Ubuntu are here.
The theoretical content can be found in the script (caution, frequent updates!). Last updated: January 20, 2020.
- BN: Complete the theoretical and programming homework exercises. The homework will be published on this web page as we go.
AP: Term paper: 4-5 pages for undergraduate students, 7-10 pages for master students. Please use the ACL 2020 stylesheet (LaTeX, Word). You can pick a topic of your choice; it does not necessarily have to be NLP-related. Documented code and running/installation instructions are part of the deliverables.
- UPDATE 14.02: The code should be documented in the standard way: provide docstrings/comments which explain what it does and why, especially for the more complicated chunks of code.
Preliminary schedule of the practical sessions:
Introduction and Overview
Recall how to program in Python and get familiar with the development environment (VSCode, IPython).
General feedback and solution to the first homework.
|21 Oct||Vectors and matrices|
Basic end-to-end example
An end-to-end example of applying a neural network to a simple classification task. We will implement a feed-forward network using the basic PyTorch primitives.
Additional material on using tensors in PyTorch.
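As a taste of what this session covers, here is a minimal sketch of a feed-forward network built directly from basic PyTorch primitives (tensors and autograd, no `torch.nn` layers); the task size and dimensions are illustrative assumptions, not the actual session code.

```python
import torch

torch.manual_seed(0)

# Randomly initialized parameters: input dim 4, hidden dim 8, 2 classes.
# requires_grad=True tells autograd to track gradients for these tensors.
W1 = torch.randn(4, 8, requires_grad=True)
b1 = torch.zeros(8, requires_grad=True)
W2 = torch.randn(8, 2, requires_grad=True)
b2 = torch.zeros(2, requires_grad=True)

def forward(x):
    """Feed-forward pass: linear -> ReLU -> linear."""
    h = torch.relu(x @ W1 + b1)
    return h @ W2 + b2   # raw scores (logits) for the 2 classes

x = torch.randn(4)       # a toy input vector
scores = forward(x)
print(scores.shape)      # torch.Size([2])
```

Training then amounts to computing a loss over `scores`, calling `backward()`, and nudging the parameters along their gradients.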
General feedback and solution to the second homework.
|28 Oct||Theoretical homework and the corresponding solution.|
Language classification (I): Application design
Tackle the simple NLP task of classifying person names according to their language (English, German, French, …). Implement a couple of higher-level modules/classes on top of the basic primitives provided by the PyTorch framework, which will allow us to build more complex deep learning models.
(Link to the theoretical homework moved up, see 28 Oct)
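To illustrate the "higher-level modules on top of PyTorch primitives" idea, here is a hypothetical sketch of such a module for name classification: characters are embedded and pooled, then scored per language. The class name, pooling strategy, and dimensions are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class NameClassifier(nn.Module):
    """Toy classifier: embed the characters of a name, sum the
    embeddings, and map the result to per-language scores."""

    def __init__(self, alphabet_size, emb_dim, num_langs):
        super().__init__()
        self.emb = nn.Embedding(alphabet_size, emb_dim)
        self.lin = nn.Linear(emb_dim, num_langs)

    def forward(self, char_ids):
        # char_ids: LongTensor of character indices for one name
        return self.lin(self.emb(char_ids).sum(dim=0))

model = NameClassifier(alphabet_size=26, emb_dim=10, num_langs=3)
name = torch.tensor([7, 4, 11, 11, 14])  # a name as character indices
scores = model(name)
print(scores.shape)  # torch.Size([3]): one score per language
```

Subclassing `nn.Module` makes the parameters of the sub-modules (`emb`, `lin`) visible to optimizers via `model.parameters()`, which is what lets us compose more complex models from such building blocks.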
|4 Nov||Linear separability|
Application design continued
|11 Nov||Theoretical homework and the corresponding solution.|
Stochastic gradient descent
Implement stochastic gradient descent, learn about PyTorch optimizers (e.g. Adam).
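A minimal sketch of the two sides of this session: a gradient-descent update written by hand, and the same kind of loop delegated to a built-in PyTorch optimizer (Adam). The toy objective (x - 3)^2 and the learning rates are illustrative assumptions.

```python
import torch

# Manual update rule: x <- x - lr * d(loss)/dx
x = torch.tensor(0.0, requires_grad=True)
for _ in range(100):
    loss = (x - 3.0) ** 2    # toy objective, minimized at x = 3
    loss.backward()          # compute d(loss)/dx into x.grad
    with torch.no_grad():
        x -= 0.1 * x.grad    # the gradient-descent step itself
        x.grad.zero_()       # reset the accumulated gradient

# The same loop with a built-in optimizer:
y = torch.tensor(0.0, requires_grad=True)
opt = torch.optim.Adam([y], lr=0.1)
for _ in range(500):
    opt.zero_grad()
    loss = (y - 3.0) ** 2
    loss.backward()
    opt.step()

# Both x and y end up close to the minimum at 3.0.
```

In the stochastic variant, each iteration computes the loss on a single (randomly chosen) training example or a small batch rather than on the full dataset.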
Full train/dev/test split. With SGD, we should be able to train on the entire training set. We have already seen the dev part (dev80.csv + dev20.csv). You should avoid doing any experiments with test.csv yet.
Batching is the technique of specifying neural computations over batches (i.e., sets) of dataset elements. It allows for better parallelization and, hence, faster computations.
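The gain from batching comes from replacing a Python loop with a single tensor operation. A minimal sketch (dimensions are illustrative assumptions):

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 2)                       # a toy linear layer

xs = [torch.randn(4) for _ in range(8)]     # 8 dataset elements
one_by_one = torch.stack([x @ W for x in xs])   # element at a time

batch = torch.stack(xs)                     # shape (8, 4): one tensor
batched = batch @ W                         # one call, shape (8, 2)

print(torch.allclose(one_by_one, batched))  # True: same result
```

The batched version gives the same numbers, but the single matrix multiplication can be parallelized far better than the element-wise loop.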
Backpropagation (practical aspects)
POS tagging (I): Embedding
The code (without the dataset) is also on github.
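The core of the embedding step can be sketched with `nn.Embedding`, which maps each word index to a trainable dense vector; the vocabulary size and embedding dimension below are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Lookup table: 100 words, each represented by a 16-dimensional vector.
emb = nn.Embedding(num_embeddings=100, embedding_dim=16)

sentence = torch.tensor([5, 42, 7])   # word indices of a 3-word sentence
vectors = emb(sentence)               # one embedding vector per word
print(vectors.shape)  # torch.Size([3, 16])
```

The embedding vectors are ordinary parameters: they receive gradients during backpropagation and are trained together with the rest of the tagger.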
POS tagging (II): Scoring and Training
POS tagging (III): Training and LSTMs
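A minimal sketch of the LSTM step in the tagger: a bidirectional LSTM turns each word's embedding into a context-sensitive vector (dimensions are illustrative assumptions).

```python
import torch
import torch.nn as nn

# Bidirectional LSTM: 16-dim word vectors in, 32 hidden units per direction.
lstm = nn.LSTM(input_size=16, hidden_size=32, bidirectional=True)

sent = torch.randn(5, 1, 16)   # (seq_len=5, batch=1, emb_dim=16)
out, _ = lstm(sent)            # contextualized vector for each position
print(out.shape)  # torch.Size([5, 1, 64]): 2 directions * 32 units
```

The per-position output vectors are then scored against the POS tagset, so each tagging decision can depend on the whole sentence rather than on a single word.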
POS tagging (IV): pre-trained word embeddings + dropout
Finalize the implementation of the POS tagger.
Note that the code contains certain optimizations implemented during the break. These optimizations speed up training without changing the underlying model (the accuracy should be roughly the same).
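The dropout part of this session can be sketched in a few lines: during training, `nn.Dropout` zeroes random components (rescaling the survivors), while in evaluation mode it is the identity. The dropout rate below is an illustrative assumption.

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(10)

drop.train()                   # training mode: dropout is active
print(drop(x))                 # roughly half zeros, survivors scaled to 2.0

drop.eval()                    # evaluation mode: dropout is disabled
print(torch.equal(drop(x), x)) # True: the input passes through unchanged
```

This is why calling `model.train()` before training and `model.eval()` before evaluation matters as soon as a model contains dropout layers.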
Dependency parsing (I)
Version of the code after the session (WARNING: this is the last time the code is published on this web page; make sure to keep track of your own code from now on!)
Dependency parsing (II)
Project proposal presentations
Dependency parsing (III)
Dependency-aware loss function (click raw at the top of the page to copy-and-paste the code)
Some topics we may consider later on:
- Structured prediction
- Regularization (dropout)
- "Recursive" (tree-structured) networks
- Language modeling with neural networks
- Unsupervised learning of word embeddings
- Multi-task learning
- Neural machine translation