This thesis topic is no longer available

Fine-grained annotation of sentiment for low-resource languages

Supervised machine learning models have become the default paradigm in sentiment analysis, leading to state-of-the-art results on most tasks. However, the best results within this framework presuppose large amounts of high-quality annotated data, which is often lacking for under-resourced languages such as Norwegian. Annotation projects require a great deal of time, money, and effort to create high-quality resources. Techniques that reduce these requirements would therefore be a boon to the creation of annotated data in under-resourced languages.

This thesis will contribute by comparing three approaches (annotating small amounts of data, annotation projection, and cross-lingual annotation methods) to determine which technique is most suitable and under which circumstances. The precise details and scope of the thesis will be decided in agreement between the supervisors and the candidate.

The project presupposes good programming skills, experience with machine learning and a solid background in NLP. Please contact the supervisors to discuss further details.


Keywords: sentiment analysis, machine learning, NLP, low-resource languages
Published 18 Oct. 2018 10:16 - Last modified 19 Aug. 2019 14:43


Scope (credits)