Ideal tagset determination for Universal PoS-tags

There has been some recent work on using information bottlenecks to “squeeze” representations (e.g. of part-of-speech tags) while keeping output accuracy roughly unchanged. This implies that a tagset can be reduced to its essential tags. Applying the same idea to Universal Dependencies datasets could yield a cross-lingual notion of which POS tags or dependency relations are essential.
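
To make the idea of downsizing a tagset concrete, here is a minimal count-based sketch. It is not the bottleneck method from the work mentioned above. It is a simple stand-in that measures how much mutual information between tags and a target label is lost when two tags are merged, and picks the cheapest merge. The function names and the toy (UPOS tag, dependency relation) pairs are invented for illustration; a real experiment would read them from Universal Dependencies treebanks.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Empirical mutual information I(T;Y) in bits over (tag, label) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    tag_counts = Counter(t for t, _ in pairs)
    label_counts = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2(c * n / (tag_counts[t] * label_counts[y]))
        for (t, y), c in joint.items()
    )


def merge_tags(pairs, a, b, merged):
    """Relabel every occurrence of tag a or b as the single tag `merged`."""
    return [(merged if t in (a, b) else t, y) for t, y in pairs]


def cheapest_merge(pairs):
    """Return (loss_in_bits, tag_a, tag_b) for the merge that loses least information."""
    base = mutual_information(pairs)
    tags = sorted({t for t, _ in pairs})
    best = None
    for i, a in enumerate(tags):
        for b in tags[i + 1:]:
            merged = merge_tags(pairs, a, b, a + "+" + b)
            loss = base - mutual_information(merged)
            if best is None or loss < best[0]:
                best = (loss, a, b)
    return best


# Hypothetical toy data: (UPOS tag, dependency relation of the token).
toy = [
    ("NOUN", "nsubj"), ("NOUN", "obj"),
    ("PROPN", "nsubj"), ("PROPN", "obj"),
    ("VERB", "root"), ("VERB", "root"),
    ("ADJ", "amod"), ("ADJ", "amod"),
]

loss, tag_a, tag_b = cheapest_merge(toy)
print(f"merge {tag_a} and {tag_b}: loses {loss:.3f} bits")
```

In this toy data NOUN and PROPN behave identically with respect to the dependency relation, so merging them costs nothing, while any other merge discards information. Repeating the step until the loss exceeds some threshold would give one crude notion of an "essential" tagset, and comparing the resulting merges across languages is the cross-lingual question the project asks.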

Published Oct. 8, 2019 12:18 - Last modified Oct. 8, 2019 12:18

