Doctoral candidate Andrey Kutuzov at the Department of Informatics, Faculty of Mathematics and Natural Sciences, is defending the thesis "Distributional word embeddings in modeling diachronic semantic change" for the degree of Philosophiae Doctor.

Trial lecture

"Word Sense Induction"

Main research findings

This thesis studies how distributional vector representations capture changes in lexical meaning. In natural human languages, words change what they mean over time. These diachronic semantic shifts can be detected automatically. We do this by analyzing changes in the behavior of large-scale neural language models trained on texts created in different time periods.

Distributional semantic models based on dense vector representations (word embeddings) efficiently capture many aspects of word meaning. As such, they are extremely important for natural language processing systems that aim to understand and generate human language. However, they are mostly applied synchronically, that is, to language data without taking temporal drift into account.

We move on to the diachronic realm and employ word embeddings to achieve unsupervised data-driven detection of temporal semantic change. We train diachronic models in different ways, and devise methods to solve the task of detecting how words change their meaning and usage over time. The findings in this thesis are important both for general linguistics and for practical applications like web search and digital humanities.
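To make the idea of embedding-based change detection concrete, here is a minimal sketch of one approach that is widely used in this research area: align two embedding spaces with an orthogonal Procrustes rotation, then rank words by the cosine distance between their aligned vectors. It is an illustration only and is not claimed to be the method developed in the thesis. The function names, the toy vocabulary, and the random matrices standing in for embeddings trained on two time periods are all assumptions made for this example.

```python
import numpy as np


def procrustes_align(source, target):
    """Rotate `source` onto `target` with the orthogonal matrix R
    that minimises the Frobenius norm of (source @ R - target)."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return source @ (u @ vt)


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two vectors."""
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def semantic_change_scores(vocab, early, late):
    """Score each word by how far its vector moved between two periods.

    Row i of `early` and row i of `late` must both belong to vocab[i].
    """
    aligned = procrustes_align(early, late)
    return {w: cosine_distance(aligned[i], late[i]) for i, w in enumerate(vocab)}


if __name__ == "__main__":
    # Synthetic stand-ins for embeddings trained on two time periods
    # (hypothetical data, not real models).
    rng = np.random.default_rng(42)
    vocab = ["cell", "tree", "river", "stone", "bread", "music", "glass", "cloud"]
    early = rng.normal(size=(len(vocab), 3))
    rotation, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    late = early @ rotation          # same geometry, arbitrarily rotated space
    late[0] = rng.normal(size=3)     # simulate a meaning shift for "cell"

    scores = semantic_change_scores(vocab, early, late)
    for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{word}\t{score:.3f}")
```

The alignment step is needed because independently trained embedding spaces are only comparable up to a rotation; without it, even words whose meaning is stable would appear to have moved.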



