Research Activities

Computational phonology is a young field of study within the old discipline of linguistics, the scientific study of human language. Specifically, phonology asks questions about the inner workings of the sound systems of human languages, such as how sounds combine to form words, how they might change their quality in that process, which abstract concepts and categories should be used to model sounds as cognitively intended gestures, and so on. Phonologists are frequently more interested in abstract characterizations of the bundles of vocal tract properties that define a sound than in their detailed physical-acoustic realization (which they leave to the neighbouring discipline of phonetics). For example, a "p" as in "papa" would be characterised by closed lips and a special, wide-open state of the vocal folds (behind the Adam's apple). Such abstractly characterizable sounds combine to form syllables (the parts of a sound string you demarcate by tapping along to a rhythmical pronunciation of a word - wouldn't you tap twice in pa-pa?), syllables form words, and so on. All of this varies from language to language, and phonologists are interested in the range of this variation and (possibly) deep explanations of what's behind the diversity (and unity) found in the world's languages, apart from looking behind phenomena in individual languages, of course.

Computers can enter this picture when scientists try to create very detailed and precise models of some phenomenon in a given language. It is very easy for a human to lose sight of what's interacting with what, or, worse, to start cheating somehow because you as a language user know the outcome already. Computers help on both counts: they know nothing of language unless taught so bit by bit, and they are good at mechanically working out the final consequences of some complicated theory. In phonology, those consequences often take the form of some weird string of sounds. One could then either look at that string to decide whether the theory has gotten it wrong again (very often, sigh...), or, more sophisticated, try to match things automagically against a corpus of good examples which some language wizard has already entered for us. Still more exciting would be a machine that actually gives an acoustic impression of what the theory's output would sound like, or makes some simulated artificial vocal tract move the lips, tongue and what not (see the papers section for a proposal in that direction). Other people also try the reverse direction, getting from large numbers of good examples to intelligent guesses of a theory or some subpart of it. That's called machine learning, and first applications to phonology already exist.
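To make the corpus-matching idea concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the deliberately incomplete toy "theory" (word-final devoicing that only covers "d") and the three-entry corpus are all hypothetical and not taken from any of the work described on this page.

```python
def compare_with_corpus(predict, corpus):
    """Return the inputs on which the theory disagrees with the corpus.

    predict: function mapping an input form to the theory's output form.
    corpus:  dict mapping input forms to attested output forms.
    The result maps each failing input to (predicted, attested).
    """
    mismatches = {}
    for underlying, attested in corpus.items():
        predicted = predict(underlying)
        if predicted != attested:
            mismatches[underlying] = (predicted, attested)
    return mismatches


def toy_theory(form):
    """Hypothetical, deliberately incomplete theory: devoice final 'd' only."""
    if form.endswith("d"):
        return form[:-1] + "t"
    return form


# Hypothetical mini-corpus of input/output pairs, entered by hand.
toy_corpus = {"hund": "hunt", "tag": "tak", "lob": "lop"}

for underlying, (predicted, attested) in compare_with_corpus(toy_theory, toy_corpus).items():
    print(f"{underlying}: theory says {predicted!r}, corpus says {attested!r}")
```

The machine does the tedious comparison, and the mismatches point straight at the part of the theory that needs repair (here, the missing cases for "g" and "b").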

Now for some more specialist remarks. My main research at the moment involves Computational Prosodic Morphology, carried out mostly within the confines of a framework called Declarative Phonology. The latter has as its major hallmarks a commitment to non-destructive, information-adding operations only (monotonicity), a clear conceptual separation between formal descriptions and the objects in the world so described (intensionality), no distinction between rules and representations (all-constraint-based), and no levels of representation (monostratality).

Currently I'm writing up a substantial tech report on an implemented application of Computational Prosodic Morphology, a model of Modern Hebrew (MH) verbs. This is the outcome of a collaboration with Dafna Meyouhas-Graf, a native speaker of Israeli Hebrew. MH as a Semitic language is said to show so-called non-concatenative morphology. We claim, however, that languages like MH are best modelled by familiar concatenative affixation. What is distinctive is rather the widespread optionality of vowels (i.e. V/0 alternation potential, formally modelled by systematic disjunction), though this is largely confined to open-class items such as verb stems. Whether any given instance of an optional vowel is realized or not depends on prosodic constraints which regulate the shape of syllables. This leaves a window of indeterminacy for some cases, which we resolve by appealing to a single principle of left-to-right optimization: preference is given to omitting optional elements as early as possible (i.e., whenever no constraint violation forbids their omission). This principle is likely to originate in performance advantages. Prosodic prespecification of some onsets pins down surface shape in phonologically unpredictable cases. Lots of other nifty details are filled into this broad picture to yield a complete analysis of all 145 surface forms in the regular verbal paradigm of Modern Hebrew. We have computer proofs that all and only the correct forms follow from our model, which can also generalize to unknown words and is able to reject a range of impossible Hebrew verbs on the way. (The same broad picture, as applied to an American Indian language called Tonkawa, is already contained in a talk handout written in German; see the papers section.)
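The interplay of optional vowels, syllable-shape constraints and left-to-right optimization can be illustrated with a small Python sketch. This is a toy under stated assumptions, not the implementation from the tech report: the single constraint (every syllable is CV or CVC), the function names and the example strings are all hypothetical, and the brute-force enumeration merely stands in for whatever mechanism the real model uses.

```python
import re
from itertools import product

VOWELS = set("aeiou")
# Toy prosodic constraint (an assumption): a word is a sequence of CV or CVC syllables.
SYLLABLES = re.compile(r"(?:CVC?)+")


def well_formed(form):
    """Check the toy syllable constraint on the form's consonant/vowel skeleton."""
    skeleton = "".join("V" if seg in VOWELS else "C" for seg in form)
    return SYLLABLES.fullmatch(skeleton) is not None


def realize(segments):
    """Pick the surface form preferred by left-to-right optimization.

    segments: list of (segment, is_optional) pairs, in linear order.
    Candidates are tried in an order where omitting an earlier optional
    element always outranks omitting a later one; the first candidate that
    satisfies the constraint wins. Returns None if no candidate does.
    """
    optional = [i for i, (_, is_opt) in enumerate(segments) if is_opt]
    # product() varies the rightmost position fastest, so False (= omit)
    # at the leftmost optional position is explored first.
    for choice in product((False, True), repeat=len(optional)):
        keep = dict(zip(optional, choice))
        form = "".join(seg for i, (seg, _) in enumerate(segments) if keep.get(i, True))
        if well_formed(form):
            return form
    return None


# Hypothetical stem with two optional vowels, written g(a)d(a)l.
stem = [("g", False), ("a", True), ("d", False), ("a", True), ("l", False)]
suffix = [("a", False)]

print(realize(stem))           # bare stem
print(realize(stem + suffix))  # stem plus a vowel-initial suffix
```

In this toy setting the bare stem has to keep both vowels, whereas adding the suffix lets the second vowel drop; the first vowel cannot be omitted in either case because the constraint rules out the resulting initial cluster.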

The PhD thesis contains more computational prosodic morphology still, the language of choice here being Tigrinya.

Finally, another strand of my work involves fiddling with a simple yet quite general approach to Optimality Theory computer implementation which I dubbed OT SIMPLE. More details are found here.


Markus Walther