Sentiment Analysis for Norwegian Text

The SANT project aims to create training data and machine-learned models for Sentiment Analysis for Norwegian Text. The project is coordinated by the Language Technology Group at IFI/UiO, with NRK, Schibsted and Aller Media as collaborating partners.

Sentiment Analysis (SA)

One of the applications of Language Technology (LT) that has gained the most widespread use in recent years is so-called opinion mining or sentiment analysis (SA). In broad terms, SA is the task of automatically identifying the opinions, attitudes or emotions expressed in subjective text.

The goal of SANT is to make industry-scale generic technology for Sentiment Analysis available for Norwegian. To achieve this, SANT will initiate a new collaboration comprising the Language Technology Group (LTG) at the Department of Informatics at the University of Oslo and three of Norway's largest media groups: the public broadcaster NRK/P3 and the privately held Schibsted Media Group and Aller Media.

Reviews as training data

The SANT project will take advantage of a peculiarity of the way reviews and critiques are typically summarized in Norwegian arts journalism and consumer journalism, viz. by an explicit rating on a scale of 1–6, represented as a throw of a die. We propose to use this feature to semi-automatically compile a polarity-labeled text collection. We can then use this collection to train and evaluate machine-learned models for sentiment analysis at the document level.
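As a rough illustration of how die ratings could be turned into polarity labels, here is a minimal Python sketch. The thresholds (1–2 as negative, 5–6 as positive, 3–4 left unlabeled) and the names `rating_to_polarity` and `label_reviews` are assumptions made for this example only; the project description does not specify the mapping.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical thresholds: the project text does not say how die ratings
# map to polarity classes, so these cut-offs are illustrative only.
NEGATIVE_MAX = 2
POSITIVE_MIN = 5


def rating_to_polarity(rating: int) -> Optional[str]:
    """Map a die rating (1-6) to a polarity label, or None if ambiguous."""
    if not 1 <= rating <= 6:
        raise ValueError(f"die rating must be in 1..6, got {rating}")
    if rating <= NEGATIVE_MAX:
        return "negative"
    if rating >= POSITIVE_MIN:
        return "positive"
    return None  # middle ratings are left unlabeled in this sketch


def label_reviews(reviews: Iterable[Tuple[str, int]]) -> List[Tuple[str, str]]:
    """Keep only reviews whose rating yields a clear polarity label."""
    labeled = []
    for text, rating in reviews:
        polarity = rating_to_polarity(rating)
        if polarity is not None:
            labeled.append((text, polarity))
    return labeled


# Placeholder review texts, not real data.
sample = [("Review text A", 6), ("Review text B", 3), ("Review text C", 1)]
labeled_sample = label_reviews(sample)
```

Dropping the middle ratings is only one possible design choice; the full 1–6 scale could just as well be kept as a six-way classification target.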

Fine-grained SA

For some applications, however, it is desirable to have models that can make more granular predictions at the sentence level, and additionally identify the targets and holders of the opinions. To enable such models, a subset of the review corpus will therefore be manually annotated with fine-grained in-sentence polarity information.
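To make the notions of holder, target and polarity concrete, the following Python sketch shows one way a single in-sentence annotation could be represented. The `OpinionAnnotation` class, its fields, the helper functions and the English example sentence are all hypothetical; the actual annotation scheme is not described here.

```python
from dataclasses import dataclass
from typing import Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


@dataclass(frozen=True)
class OpinionAnnotation:
    """Hypothetical record for one opinion expressed within a sentence."""
    holder: Span      # who holds the opinion
    target: Span      # what the opinion is about
    expression: Span  # the words that carry the sentiment
    polarity: str     # e.g. "positive" or "negative"


def span_of(sentence: str, phrase: str) -> Span:
    """Return the offsets of the first occurrence of phrase in sentence."""
    start = sentence.find(phrase)
    if start < 0:
        raise ValueError(f"{phrase!r} not found in sentence")
    return (start, start + len(phrase))


def text_of(sentence: str, span: Span) -> str:
    """Recover the text covered by a span."""
    return sentence[span[0]:span[1]]


# Invented example sentence, used only to illustrate the structure.
sentence = "The critic loved the soundtrack."
annotation = OpinionAnnotation(
    holder=span_of(sentence, "The critic"),
    target=span_of(sentence, "the soundtrack"),
    expression=span_of(sentence, "loved"),
    polarity="positive",
)
```

Storing character offsets rather than copied strings keeps each annotation tied to its exact position in the sentence, which matters when the same word occurs more than once.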

Neural modeling

In the field of AI in general, and LT in particular, the use of many-layered artificial neural networks (so-called Deep Learning) has recently seen a great revival with many successful applications, including sentiment analysis. The classifiers developed in this project will seek to push the state of the art in large-scale sentiment analysis using deep neural architectures.

Financing

The project is currently granted funding from the RCN's IKTPLUSS initiative for a preliminary trial phase that will run until November 2017.

Tags: Sentiment Analysis, Language Technology, Natural Language Processing, Machine Learning, NLP, AI, Artificial intelligence, deep learning, data science
Published June 12, 2017 11:12 PM - Last modified June 26, 2017 3:08 PM