Political party classification with neural methods
This project will investigate the use of neural architectures for classifying the political party affiliation of parliamentary speeches and/or speakers. The primary data set will be the Talk of Norway corpus (ToN), which comprises speeches from the Norwegian parliament from 1998 to 2016. Some knowledge of neural networks, e.g. from the courses INF4490 or INF5860, will be an advantage when working on this project.
The project aims to develop neural-network-based methods for analyzing speeches in the parliament. The idea is to employ so-called autoencoders to represent each document as a continuous, low-dimensional vector, a so-called document embedding. The autoencoder will be built on a particular type of neural architecture known as a bidirectional LSTM, taking character or word representations as input. These input representations will themselves be embeddings (i.e. low-dimensional dense vectors), and can be trained using the Python toolkit gensim. The LSTM can be implemented using Keras (also in Python).
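As a rough illustration of the character-level input mentioned above, the following minimal sketch turns raw text into fixed-length sequences of integer indices, the form that an embedding layer followed by an LSTM would typically consume. The function names, the reserved padding and unknown indices, the maximum length and the two example sentences are all assumptions made for this sketch; none of them come from the ToN corpus or from the project itself.

```python
# Reserved indices (an assumption of this sketch): 0 pads short texts,
# 1 stands in for characters not seen when the vocabulary was built.
PAD, UNK = 0, 1


def build_char_vocab(texts):
    """Map every character occurring in `texts` to an integer index >= 2."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 2 for i, ch in enumerate(chars)}


def encode(text, vocab, max_len):
    """Turn `text` into exactly `max_len` indices: truncate, then right-pad."""
    ids = [vocab.get(ch, UNK) for ch in text[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


# Made-up placeholder sentences, not taken from the corpus.
speeches = ["Takk, president.", "Ja takk."]
vocab = build_char_vocab(speeches)
encoded = [encode(s, vocab, max_len=20) for s in speeches]
```

Each row of `encoded` has the same length, so the rows can be stacked into a single integer matrix and passed to a model in batches. A word-level variant would follow the same pattern with tokens in place of characters.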