Statistical Machine Translation (Winter 2018/2019)

General Information

Instructor: Jakub Waszczuk
Course web page:
(This web page, which will be updated throughout the course.)
Office hours: by appointment
Language: English

Course Description

In this course, we will introduce the basic methods of statistical machine translation (SMT), such as word-based and phrase-based models (towards the end, we will also consider more sophisticated methods). A central issue in SMT is not only the models themselves, but also the estimation of their parameters, hence there is a strong focus on methods of machine learning.

Format of Course

Every week, there will be one theoretical session, where we introduce the main concepts, methods, and techniques, and one practical session, where we implement them in a gradually growing program.

We will program in Java, so some programming background is required (but there will be a short introduction).

Passing the Course

  • BN: You will need to complete the theoretical and the programming exercises, which we will be working on during the practical sessions. You may also need some additional time at home to finalize and polish your solutions.
  • AP: Same as for BN, plus a final written examination or a project. The project topics to choose from will be announced at the end of January.


Preliminary schedule:

12 Oct

Introduction and Overview

19 Oct

Probability theory (part I)

Lecture slides, lab session slides
26 Oct

Probability theory (part II)

Lecture slides (handout version), binomial java code

Homework exercises and the accompanying code
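To give a flavor of the kind of code covered in the lab, a binomial probability mass function might be sketched in Java as follows (class and method names are illustrative assumptions, not the actual lab code):

```java
// Hypothetical sketch of the binomial distribution:
// P(X = k) = C(n, k) * p^k * (1 - p)^(n - k).
public class Binomial {

    // Binomial coefficient C(n, k), computed iteratively to avoid
    // overflowing intermediate factorials.
    static long choose(int n, int k) {
        long c = 1;
        for (int i = 1; i <= k; i++) {
            c = c * (n - k + i) / i;
        }
        return c;
    }

    // Probability of exactly k successes in n independent trials,
    // each succeeding with probability p.
    static double pmf(int n, int k, double p) {
        return choose(n, k) * Math.pow(p, k) * Math.pow(1 - p, n - k);
    }
}
```

For example, the probability of exactly 3 heads in 10 fair coin flips is C(10, 3) / 2^10 = 120 / 1024.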
2 Nov

No session

Catch-up sessions doodle:

Please fill in the doodle by Wednesday, October 24!
9 Nov

Language models

Bayes’ theorem, parameter estimation (maximum a posteriori, maximum likelihood), and n-gram models (lecture slides)
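The maximum-likelihood estimation of bigram probabilities can be sketched in Java roughly as follows (a minimal illustration under the simplest assumptions, without smoothing; not the course's reference code):

```java
import java.util.HashMap;
import java.util.Map;

// A minimal sketch of maximum-likelihood bigram estimation:
// P(w2 | w1) = count(w1 w2) / count(w1 as a history).
public class BigramModel {
    private final Map<String, Integer> historyCount = new HashMap<>();
    private final Map<String, Integer> bigramCount = new HashMap<>();

    // Accumulate counts from one tokenized sentence.
    void train(String[] tokens) {
        for (int i = 0; i + 1 < tokens.length; i++) {
            historyCount.merge(tokens[i], 1, Integer::sum);
            bigramCount.merge(tokens[i] + " " + tokens[i + 1], 1, Integer::sum);
        }
    }

    // MLE estimate of P(w2 | w1); returns 0 for unseen histories.
    double prob(String w1, String w2) {
        int h = historyCount.getOrDefault(w1, 0);
        if (h == 0) return 0.0;
        return bigramCount.getOrDefault(w1 + " " + w2, 0) / (double) h;
    }
}
```

A real model would add sentence-boundary markers and smoothing for unseen events; this sketch only shows the counting step.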

Homework exercises and the accompanying code

Note: extra session from 16:30 to 18:00

16 Nov

IBM Model 1

(lecture slides)
23 Nov

IBM Model 1 (continued)

Expectation-Maximization (lecture slides, complementary proofs)
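One EM iteration for IBM Model 1 can be sketched in Java as below (a simplified illustration assuming sentence pairs given as parallel token arrays and omitting the NULL word; the data structures are assumptions, not the course's reference implementation):

```java
import java.util.*;

// A minimal sketch of EM training for IBM Model 1.
public class IBM1 {
    // t.get(f).get(e) = t(f | e): probability of foreign word f given English word e.
    Map<String, Map<String, Double>> t = new HashMap<>();

    // Each corpus entry is {englishTokens, foreignTokens}.
    // Initialize t(f | e) uniformly over the foreign vocabulary.
    void init(List<String[][]> corpus) {
        Set<String> fVocab = new HashSet<>();
        for (String[][] pair : corpus)
            fVocab.addAll(Arrays.asList(pair[1]));
        double uniform = 1.0 / fVocab.size();
        for (String[][] pair : corpus)
            for (String e : pair[0])
                for (String f : pair[1])
                    t.computeIfAbsent(f, k -> new HashMap<>()).put(e, uniform);
    }

    // One EM iteration: collect expected counts (E-step), then renormalize (M-step).
    void emStep(List<String[][]> corpus) {
        Map<String, Map<String, Double>> count = new HashMap<>();
        Map<String, Double> total = new HashMap<>();
        for (String[][] pair : corpus) {
            String[] es = pair[0], fs = pair[1];
            for (String f : fs) {
                // Normalizing constant: sum over all candidate English words.
                double z = 0.0;
                for (String e : es) z += t.get(f).get(e);
                for (String e : es) {
                    double c = t.get(f).get(e) / z;  // expected (fractional) count
                    count.computeIfAbsent(f, k -> new HashMap<>()).merge(e, c, Double::sum);
                    total.merge(e, c, Double::sum);
                }
            }
        }
        // M-step: t(f | e) = count(f, e) / total(e).
        for (Map.Entry<String, Map<String, Double>> fe : count.entrySet())
            for (Map.Entry<String, Double> ec : fe.getValue().entrySet())
                t.get(fe.getKey()).put(ec.getKey(), ec.getValue() / total.get(ec.getKey()));
    }
}
```

On the classic toy corpus ("the house" / "das haus", "the" / "das"), repeated iterations drive t(das | the) towards 1, since "das" co-occurs with "the" in both sentence pairs.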

Homework exercises (the part on perplexity updated on 26 November), the accompanying code, and the (partial) expected results

Note: extra session from 16:30 to 18:00
30 Nov

Higher IBM models (2 & 3)

(lecture slides, updated with answers)
7 Dec

Higher IBM models (continued)

Homework 3 feedback, IBM 3 revisited, IBM 4 & 5 (lecture slides, complementary material)

Homework exercises and the accompanying code (deadline: Tuesday, 8 January 2019 (definitive))

Notes on efficient EM
14 Dec

Phrase-based models

IBM 4 and 5 (slides)

Phrase-based translation (slides, phrase extraction algorithm)
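The core consistency check of phrase extraction, in the spirit of the standard algorithm from Koehn's book, might be sketched in Java as follows (the index encoding and class names are illustrative assumptions, not the course code):

```java
import java.util.*;

// A rough sketch of consistent phrase-pair extraction from a word alignment.
public class PhraseExtract {

    // alignment: set of (e, f) index pairs, encoded as e * fLen + f for simplicity.
    // Returns phrase pairs as {e1, e2, f1, f2} index spans (inclusive).
    static List<int[]> extract(int eLen, int fLen, Set<Long> alignment) {
        List<int[]> phrases = new ArrayList<>();
        for (int e1 = 0; e1 < eLen; e1++) {
            for (int e2 = e1; e2 < eLen; e2++) {
                // Find the f-span covered by alignment points inside [e1, e2].
                int f1 = Integer.MAX_VALUE, f2 = -1;
                for (long a : alignment) {
                    int e = (int) (a / fLen), f = (int) (a % fLen);
                    if (e >= e1 && e <= e2) {
                        f1 = Math.min(f1, f);
                        f2 = Math.max(f2, f);
                    }
                }
                if (f2 < 0) continue; // no alignment points in this e-span
                // Consistency: no point may link [f1, f2] to a word outside [e1, e2].
                boolean consistent = true;
                for (long a : alignment) {
                    int e = (int) (a / fLen), f = (int) (a % fLen);
                    if (f >= f1 && f <= f2 && (e < e1 || e > e2)) {
                        consistent = false;
                        break;
                    }
                }
                if (consistent) phrases.add(new int[]{e1, e2, f1, f2});
            }
        }
        return phrases;
    }
}
```

The full algorithm additionally extends extracted phrases over unaligned foreign boundary words; that step is omitted here for brevity.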
21 Dec

Phrase-based models (continued)

Phrase-based translation continued (slides, complementary material)

Supplementary material related to Homework 4: Viterbi for IBM-1, writeMostProbableAlignments2File, writeTransProbTable2File
11 Jan


Theoretical session: decoding, i.e., how to efficiently determine the best translation for a given sentence using a combination of the phrase-based model and the bigram language model (slides, complementary material)
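As a toy illustration of combining a phrase table with a bigram language model, a purely monotone decoder (no reordering, no pruning) might look as follows in Java; real decoders use beam search over partial hypotheses, and all names and scores below are assumptions for the sketch:

```java
import java.util.*;

// A toy monotone phrase-based decoder scored with a bigram LM (log-probabilities).
public class MonotoneDecoder {
    // phraseTable.get(srcPhrase) maps each translation option to its log-probability.
    Map<String, Map<String, Double>> phraseTable = new HashMap<>();
    Map<String, Double> bigramLogProb = new HashMap<>(); // key: "w1 w2"

    double lm(String w1, String w2) {
        // Crude constant penalty for unseen bigrams (stand-in for smoothing).
        return bigramLogProb.getOrDefault(w1 + " " + w2, -10.0);
    }

    // Dynamic program over source positions; the state is the last target word
    // (needed to score the next bigram). best.get(i) maps that word to {score, output}.
    String decode(String[] src) {
        int n = src.length;
        List<Map<String, Object[]>> best = new ArrayList<>();
        for (int i = 0; i <= n; i++) best.add(new HashMap<>());
        best.get(0).put("<s>", new Object[]{0.0, ""});
        for (int i = 0; i < n; i++) {
            for (Map.Entry<String, Object[]> state : best.get(i).entrySet()) {
                for (int j = i + 1; j <= n; j++) {
                    String srcPhrase = String.join(" ", Arrays.copyOfRange(src, i, j));
                    Map<String, Double> options = phraseTable.get(srcPhrase);
                    if (options == null) continue;
                    for (Map.Entry<String, Double> opt : options.entrySet()) {
                        double score = (double) state.getValue()[0] + opt.getValue();
                        String last = state.getKey();
                        for (String w : opt.getKey().split(" ")) {
                            score += lm(last, w); // bigram LM score
                            last = w;
                        }
                        String out = ((String) state.getValue()[1] + " " + opt.getKey()).trim();
                        Object[] cur = best.get(j).get(last);
                        if (cur == null || (double) cur[0] < score)
                            best.get(j).put(last, new Object[]{score, out});
                    }
                }
            }
        }
        // Return the highest-scoring hypothesis covering the whole source sentence.
        return best.get(n).values().stream()
                .max(Comparator.comparingDouble(h -> (double) h[0]))
                .map(h -> (String) h[1]).orElse(null);
    }
}
```

Because it is monotone, this sketch cannot reorder phrases; the reordering and pruning machinery is exactly what the decoding session is about.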

Homework exercises (concerning phrase extraction and parameter estimation within the context of the phrase-based model) and the accompanying code

UPDATE 14/01/2019: the complementary material extended with information on A* decoding
18 Jan


Theoretical session: quality evaluation in SMT (slides, complementary material about Levenshtein alignment)
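Word-level Levenshtein (edit) distance, the basis of the alignment discussed in the complementary material, can be sketched in Java as follows (a standard dynamic-programming formulation, not the course's reference code):

```java
// Word-level Levenshtein distance: the minimum number of substitutions,
// insertions, and deletions needed to turn sequence a into sequence b.
public class Levenshtein {
    static int distance(String[] a, String[] b) {
        int[][] d = new int[a.length + 1][b.length + 1];
        for (int i = 0; i <= a.length; i++) d[i][0] = i; // delete all of a[0..i)
        for (int j = 0; j <= b.length; j++) d[0][j] = j; // insert all of b[0..j)
        for (int i = 1; i <= a.length; i++) {
            for (int j = 1; j <= b.length; j++) {
                int sub = a[i - 1].equals(b[j - 1]) ? 0 : 1;
                d[i][j] = Math.min(d[i - 1][j - 1] + sub,          // match / substitute
                          Math.min(d[i - 1][j] + 1,                 // delete
                                   d[i][j - 1] + 1));               // insert
            }
        }
        return d[a.length][b.length];
    }
}
```

Backtracking through the table d recovers the alignment itself, i.e., which words were matched, substituted, inserted, or deleted.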

Practical session: continue working on phrase extraction and phrase translation probability estimation

UPDATE 20/01/2019: added complementary material about the Levenshtein alignment
25 Jan

Current trends in SMT

Theoretical session: selected approaches in neural MT (slides)

Homework exercises (implementation of a simple decoding algorithm) and the accompanying code
01 Feb

Catch-up & project

There will be no lecture; the remaining time will be dedicated to (i) finishing the last practical/theoretical exercises (if needed) and (ii) project presentations and discussions (for those who are interested in doing the project to get AP).


Here is a potential project topic, which should give you some idea of what an SMT project can look like (and what the deliverables are).

Other topics are possible, and you are encouraged to propose your own project topic.


A large portion of the material available on this page was originally created by Christian Wurm, Miriam Kaeshammer, Simon Petitjean, and Thomas Schoenemann.

This course draws heavily on the Statistical Machine Translation book by Philipp Koehn.