Computational Morphology
Wednesday, 8:30-12:00, 25.41.U1.22
Tentative course schedule
- 11 April -- Cancelled due to the strike!
- 18 April -- Introduction, terminology, recalling theoretical morphology.
Slides. Homework: read this chapter about Tamil morphology.
There are two groups: one prepares information about nouns and adjectives, the other about verbs.
If you were not in class, write to me to be assigned to one of them. You should prepare a description that
- is interesting to the other part of the audience
- contains a preliminary analysis of the facts (e.g. which types of processes are used)
- identifies places that are interesting/challenging for the analysis.
If you are unable to come on April 25, you can write a short summary (1 page) reflecting
on the topics above (after reading the chapter, of course).
- 25 April -- FSA and FST. Discussing Tamil properties. PDF.
Homework: have a look at one of the Tamil grammars in the library (24.21, spr o 700.a825 or spr o 700.r471)
and write a list of 20-30 pairs/triples that illustrate one phenomenon (like change of number or tense) and cover all the
morphophonologically different classes. If you are in class, bring your list; if not, send it to me by email before the class starts.
- 02 May -- Introducing xfst. Regular expressions for xfst. Homework.
- 09 May -- Working on transducing multicharacter labels into affixes and stack ordering (in particular, p. 144 in the book). Homework: Bambona exercise (p. 153 in the book) (don't submit code there...)
- 16 May -- Some more advanced xfst commands. Monish exercise
- 23 May -- Exploring lexc (slides), creating a dictionary. Homework: exercises about Esperanto nouns and adjectives (4.2.8, p. 218-226 in the book or here: nouns and adjectives)
- 30 May -- Transducers in lexc. Esperanto verbs. Homework: read about the Kannada language (phonology and nominal morphology)
- 6 June -- Implementing Kannada phonotactics. Example of reduplication solution.
- 13 June -- Sports day, no class! Homework for June 20: Implement nominal morphology in Kannada (Tags: +m, +f, +pl,
+nom, +gen, +acc, +dat, +loc, +abl, +voc): 2.1 - 2.3.9 + 2.5 - 2.5.3.5. Mark in the text what works and what does not.
- 20 June --
- 27 June --
- 4 July --
- 11 July --
- 19 July (not July 18, change of date!) --
Grading
For both BN and AP:
- Do your homework properly (most of the tasks with sufficient quality).
- Due dates will be announced and published here.
- You can leave your homework at the secretary's office or send it to me by email (email only for programming exercises).
- When you send me something related to this class by email, start the subject line with CompMorph18.
- Your homework files should be named HW-number-LastName.extension (e.g., HW3-Zinova.fst)
- Homework submitted after the due date earns no points.
- Up to 3 collaborators can submit a joint homework, indicating all names on the submission (please submit it once per group).
- Work that was obviously completed jointly without this being indicated will receive 0 points.
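The file naming and email conventions above can be checked automatically before submitting. The following is only a minimal sketch: the helper names are hypothetical, and the pattern assumes the form shown in the example HW3-Zinova.fst (no hyphen between "HW" and the number, a last name consisting of letters only).

```python
import re

# Hypothetical helpers; only the conventions themselves come from the course page.
# Assumption: the pattern follows the example HW3-Zinova.fst, i.e. no hyphen
# between "HW" and the number, and a last name made of letters only.
HW_NAME = re.compile(r"HW(\d+)-([^\W\d_]+)\.(\w+)")

# Prefix required at the start of every course-related email subject.
SUBJECT_PREFIX = "CompMorph18"


def is_valid_homework_name(filename):
    """Return True if filename looks like HW<number>-<LastName>.<extension>."""
    return HW_NAME.fullmatch(filename) is not None


def is_valid_subject(subject):
    """Return True if the email subject starts with the course prefix."""
    return subject.startswith(SUBJECT_PREFIX)


if __name__ == "__main__":
    print(is_valid_homework_name("HW3-Zinova.fst"))
    print(is_valid_subject("CompMorph18: homework 3"))
```

A joint submission would still need all collaborators' names listed inside the submission itself, since the file name in this sketch carries only one last name.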
For an AP:
- the AP takes the form of a Hausarbeit (term paper);
- you will have to describe a piece of morphology using one of the frameworks we will be working with;
- each student doing an AP should describe a separate piece of morphology (you can work on one language and analyse different phenomena, if you want);
- the area covered by your program should be something that takes around 70 optimal rules;
- to find such a piece, go to the library and study the shelves with grammars of languages you don't know;
- you have to present the piece of morphology you have chosen at one of the seminars.
- As a result of your work I expect to receive a script, a set of test examples (with the corresponding set of outputs), and a paper.
- The script has to work for all the cases described by the piece of morphology you aim to cover.
- Your set of test examples should be representative of the data you aim to cover. Be sure to check that all the important cases are included and that you are not testing exactly the same combination of rules multiple times (unless you provide an automated testing script that checks the output).
- In the paper you should describe the facts that you are modeling, the choices you had to make while writing the program (e.g., the ordering of rules and the selection of the formalism), the testing phase, and (optionally) the material that you are aware of but that your program, for good reasons, does not cover.
Grades:
- The description part is worth 30 points, the script part -- 60 points, the set of test examples -- 10 points;
- Grade/points correspondence:
- 1.0: 95 -- 100
- 1.3: 91 -- 94
- 1.7: 87 -- 90
- 2.0: 83 -- 86
- 2.3: 80 -- 82
- 2.7: 75 -- 79
- 3.0: 70 -- 74
- 3.3: 65 -- 69
- 3.7: 60 -- 64
- 4.0: 50 -- 59
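For a quick self-check, the point split and the grade bands above can be written as a small lookup. This is just an illustrative sketch with hypothetical function names; it assumes that 65 points fall into the 3.3 band and returns no grade below 50 points, since the table lists none.

```python
# Maximum points per part, as stated in the grading scheme.
MAX_POINTS = {"description": 30, "script": 60, "tests": 10}

# Grade bands from the table: (minimum points, grade), highest band first.
# Assumption: 65 points count toward 3.3, so the 3.7 band ends at 64.
GRADE_BANDS = [
    (95, "1.0"), (91, "1.3"), (87, "1.7"), (83, "2.0"), (80, "2.3"),
    (75, "2.7"), (70, "3.0"), (65, "3.3"), (60, "3.7"), (50, "4.0"),
]


def total_points(description, script, tests):
    """Sum the three parts after checking each against its maximum."""
    parts = {"description": description, "script": script, "tests": tests}
    for name, value in parts.items():
        if not 0 <= value <= MAX_POINTS[name]:
            raise ValueError(f"{name} must be between 0 and {MAX_POINTS[name]}")
    return description + script + tests


def grade_for(points):
    """Return the grade for a point total, or None below the lowest band."""
    for minimum, grade in GRADE_BANDS:
        if points >= minimum:
            return grade
    return None  # the table lists no grade below 50 points


if __name__ == "__main__":
    print(grade_for(total_points(description=25, script=50, tests=8)))
```

With 25 + 50 + 8 = 83 points, the lookup lands in the 83 -- 86 band.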