Dialogue Act Recognition using Dependency Features
Spoken Dialogue Systems often comprise many different interconnected modules, each responsible for a specific part of the processing pipeline. One particularly important subtask is Dialogue Act Recognition (DAR). The objective of dialogue act recognition is to map a given spoken input (typically the output of a speech recognizer) to its most likely dialogue act. For instance, the human user might utter "robot, now please go to the kitchen", and the aim of DAR is to map this utterance to a dialogue act such as "Command(GoTo(Kitchen))". Similarly, an utterance such as "oh what's your name?" can be mapped to a dialogue act represented as "Ask(WhoAreYou)".
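As a rough illustration of this input-output mapping, a naive keyword-based recognizer could be sketched as follows (the patterns and act names are made up for illustration; a real system would learn the mapping rather than hard-code it):

```python
import re

# Hypothetical pattern-to-act rules; real systems learn this mapping.
PATTERNS = [
    (re.compile(r"\bgo to the (\w+)\b"),
     lambda m: f"Command(GoTo({m.group(1).capitalize()}))"),
    (re.compile(r"\bwhat'?s your name\b"),
     lambda m: "Ask(WhoAreYou)"),
]

def recognize(utterance: str) -> str:
    """Map a raw utterance to a dialogue act, or 'Unknown' if no rule fires."""
    text = utterance.lower()
    for pattern, act in PATTERNS:
        match = pattern.search(text)
        if match:
            return act(match)
    return "Unknown"

print(recognize("robot, now please go to the kitchen"))  # Command(GoTo(Kitchen))
print(recognize("oh what's your name?"))                 # Ask(WhoAreYou)
```

Such hand-written rules quickly break down in the face of the variability described below, which is precisely why machine learning is needed.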
Dialogue act recognition is challenging in many respects. First, there can be many alternative realisations of the same dialogue act -- the user can phrase their intention in many different ways. Second, the input to the DAR module is the raw utterance as decoded by the speech recognizer, and is thus likely to contain speech disfluencies, fragments and recognition errors. Finally, various types of linguistic and pragmatic ambiguities can occur. The input-output mapping is therefore particularly complex, and machine learning techniques are often necessary to obtain reasonably accurate results.
In order to train a dialogue act recognizer using machine learning, we need to define a feature set representing relevant information extracted from the input. In most present-day systems, the features used for recognition rely on rather simple, shallow measures, such as the presence or absence of specific words or templates. The contribution of this thesis would be to develop a more linguistically informed approach, by augmenting the feature set with syntactic features extracted from the dependency structure of the utterance. The dependency structure can be generated by an existing dependency parser such as MaltParser. The core idea is that such dependency features might provide a more abstract (i.e. closer to semantics) perspective on the user utterance, and therefore hopefully more accurate recognition of the dialogue act.
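To make the idea concrete, here is a minimal sketch of turning a dependency parse into features. It assumes the parser's output has already been read into (head, relation, dependent) triples; the parse shown and the two feature templates (bare relation, and lexicalised head+relation pair) are illustrative choices, and the thesis would experiment with richer templates:

```python
def dependency_features(parse):
    """Turn (head, relation, dependent) triples into string-valued features.

    Two hypothetical feature templates are used here: the bare relation
    type, and the head word paired with the relation. A real system would
    explore richer templates (e.g. full head-relation-dependent triples).
    """
    features = set()
    for head, rel, dep in parse:
        features.add(f"rel={rel}")
        features.add(f"head={head}|rel={rel}")
    return features

# A hypothetical parse of "please go to the kitchen".
parse = [
    ("ROOT", "root", "go"),
    ("go", "advmod", "please"),
    ("go", "prep", "to"),
    ("to", "pobj", "kitchen"),
    ("kitchen", "det", "the"),
]
print(sorted(dependency_features(parse)))
```

The resulting features (e.g. "head=go|rel=prep") abstract away from word order and filler words, which is what makes them potentially more robust than surface n-grams.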
The thesis will involve both theoretical and practical work. The student will design and implement an algorithm for automatically extracting the relevant syntactic features of an utterance, and train a probabilistic model relying on these features to automatically recognize dialogue acts. The method will finally be evaluated on a small corpus of spoken utterances.
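One simple probabilistic model that could serve as a starting point is a Naive Bayes classifier over binary features. The sketch below (with an invented toy training set; the feature names follow the dependency-feature idea above) uses add-one smoothing so that unseen features do not zero out a class:

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """Collect counts for Naive Bayes from (feature_set, act) pairs."""
    act_counts = Counter()                 # P(act) statistics
    feat_counts = defaultdict(Counter)     # P(feature | act) statistics
    vocab = set()
    for feats, act in examples:
        act_counts[act] += 1
        for f in feats:
            feat_counts[act][f] += 1
            vocab.add(f)
    return act_counts, feat_counts, vocab

def classify(model, feats):
    """Return the act maximising log P(act) + sum of log P(f | act)."""
    act_counts, feat_counts, vocab = model
    total = sum(act_counts.values())
    best_act, best_score = None, float("-inf")
    for act, count in act_counts.items():
        score = math.log(count / total)
        denom = sum(feat_counts[act].values()) + len(vocab)  # add-one smoothing
        for f in feats:
            score += math.log((feat_counts[act][f] + 1) / denom)
        if score > best_score:
            best_act, best_score = act, score
    return best_act

# Tiny hypothetical training set of dependency features per utterance.
examples = [
    ({"rel=advmod", "head=go|rel=prep"}, "Command"),
    ({"rel=pobj", "head=go|rel=prep"}, "Command"),
    ({"rel=attr", "head=is|rel=nsubj"}, "Ask"),
    ({"head=is|rel=attr", "rel=nsubj"}, "Ask"),
]
model = train_nb(examples)
print(classify(model, {"head=go|rel=prep", "rel=pobj"}))  # Command
```

In the actual thesis, more powerful models (e.g. maximum entropy classifiers) would also be worth comparing against this baseline.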
Prerequisites: knowledge of parsing algorithms & machine learning approaches to NLP, as well as programming skills in Java, C++ or Python. If you are interested in this thesis topic, simply drop us an email!