Named Entity Recognition for Conflict Research
This project will focus on domain adaptation of systems for Named Entity Recognition (NER). NER is the task of identifying named entities in text and labeling them with categories such as person, location, and organization. While off-the-shelf machine-learned NER classifiers are readily available, this project will seek to adapt them to the particular domain of conflict research.
More specifically, we will be working with data about armed conflict gathered by the Uppsala Conflict Data Program (UCDP). This data pairs news texts with manually coded information about violent political events worldwide (the actors or sides in the conflict, the number of casualties, etc.). A domain-adapted NER system would help automate this event coding. The UCDP also maintains so-called actor lists, effectively dictionaries or synonymy lists of names for the relevant actors in known conflicts, which could potentially be included as domain-specific gazetteers in an otherwise general-domain NER model.
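
To make the gazetteer idea concrete, the following is a minimal sketch of how an actor list might be turned into per-token gazetteer tags. It is an illustration under stated assumptions, not the project's method: the actor names, the `ACTOR_LIST` structure, the `ACTOR` label, and the function names `build_gazetteer` and `gazetteer_tags` are all invented for this example, and the real UCDP actor lists may be organized differently.

```python
from typing import Dict, List, Tuple

# Hypothetical actor list: canonical name -> known aliases.
# These entries are invented placeholders, not UCDP data.
ACTOR_LIST: Dict[str, List[str]] = {
    "Example Liberation Front": ["ELF", "Example Liberation Front"],
    "Government of Exampleland": ["Exampleland government", "Exampleland army"],
}


def build_gazetteer(actor_list: Dict[str, List[str]]) -> Dict[Tuple[str, ...], str]:
    """Map each lower-cased, whitespace-tokenized alias to its canonical actor."""
    gazetteer: Dict[Tuple[str, ...], str] = {}
    for canonical, aliases in actor_list.items():
        for alias in aliases:
            gazetteer[tuple(alias.lower().split())] = canonical
    return gazetteer


def gazetteer_tags(tokens: List[str], gazetteer: Dict[Tuple[str, ...], str]) -> List[str]:
    """Tag tokens with B-ACTOR / I-ACTOR / O using greedy longest-match lookup."""
    max_len = max((len(key) for key in gazetteer), default=0)
    lowered = [token.lower() for token in tokens]
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(lowered[i:i + n]) in gazetteer:
                matched = n
                break
        if matched:
            tags[i] = "B-ACTOR"
            for j in range(i + 1, i + matched):
                tags[j] = "I-ACTOR"
            i += matched
        else:
            i += 1
    return tags


gazetteer = build_gazetteer(ACTOR_LIST)
tokens = "The Exampleland army clashed with ELF fighters".split()
tags = gazetteer_tags(tokens, gazetteer)
```

In a setup like this, the tags would not be used as final predictions. They could instead be passed to a general-domain NER model as one additional feature per token, so that the model learns how much to trust a gazetteer match in context. Case-insensitive matching is a simplification here: short acronyms can collide with ordinary words once lower-cased, so a real system might prefer case-sensitive lookup for such aliases.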