Extending the Norwegian wordnet
A wordnet for Norwegian (NWN) has recently been developed, with 50,000 synsets and roughly 90 per cent coverage of the senses in running newspaper text (given correct tagging and compound segmentation). The resource is freely available under an open source licence. To be really useful in applications, however, it needs to be extended. Several methods are available for this:
1) Merging with NorNett. Another wordnet has been developed at the department of Nordic studies, UiO. This resource has now been released for wider use, and could be merged with NWN. Since there is significant overlap between the two resources, this is a non-trivial task.
2) Using large corpora, with vector space models or pattern-based approaches.
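To make the vector space option in 2) more concrete, here is a minimal sketch of how a word missing from the wordnet could be matched to an existing synset by comparing co-occurrence vectors. It is only an illustration, not the method used to build NWN: the toy corpus, the synset identifiers (`dog.n`, `cat.n`) and the function names are all hypothetical, and a real experiment would use a large tagged corpus and a proper weighting scheme.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each token to a Counter of the tokens seen within `window` positions."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_synsets(word, synsets, vectors):
    """Rank synsets by the best similarity between `word` and any synset member."""
    target = vectors.get(word, Counter())
    scores = []
    for synset_id, members in synsets.items():
        best = max(
            (cosine(target, vectors[m]) for m in members if m in vectors),
            default=0.0,
        )
        scores.append((best, synset_id))
    return sorted(scores, reverse=True)


# Hypothetical toy corpus (already tokenized) and toy synsets.
corpus = [
    ["hunden", "bjeffer", "ute"],
    ["bikkja", "bjeffer", "ute"],
    ["katten", "maler", "inne"],
]
toy_synsets = {"dog.n": ["hunden"], "cat.n": ["katten"]}

vectors = cooccurrence_vectors(corpus)
ranked = rank_synsets("bikkja", toy_synsets, vectors)
print(ranked[0][1])  # best candidate synset for the unseen word "bikkja"
```

In this toy setting "bikkja" shares all its contexts with "hunden", so the dog synset is ranked first. The same ranking idea could in principle flag overlapping entries when merging two wordnets, though that would be a separate design decision.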
If you are interested, please contact Lilja Øvrelid. The thesis will be co-supervised with Lars Nygaard from Kaldera Språkteknologi, the creator of NWN.
Published 1 Oct. 2013 08:28
- Last modified 10 Feb. 2015 10:29