Identity Resolution (IR) is the process of determining which entities mentioned in one or more data sources are actually identical. Examples include
- identification of identical media files on one server,
- identification of individuals between a customer database and an employee database,
- identification of entities between public information sources like MusicBrainz, Wikipedia, etc.
From a semantic technology standpoint, IR can be used to establish a dataset of “owl:sameAs” triples, each stating that two identifiers denote the same entity. However, existing tools (local and federated query engines) do not currently make use of such identity information.
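To make the idea of an “owl:sameAs” dataset concrete, here is a minimal sketch in Python. It assumes two hypothetical sources (a customer and an employee table keyed by made-up example.org IRIs) and uses a deliberately crude matching rule, namely equality of names after lowercasing and whitespace normalization. Real IR tools use far more robust similarity measures; only the shape of the output is the point here.

```python
from itertools import product

# The owl:sameAs property IRI from the OWL vocabulary.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalize(name: str) -> str:
    """Crude matching key (an assumption): lowercase and collapse whitespace."""
    return " ".join(name.lower().split())


def same_as_triples(source_a: dict, source_b: dict) -> list:
    """Return N-Triples lines linking IRIs whose names share a normalized key."""
    triples = []
    for (iri_a, name_a), (iri_b, name_b) in product(source_a.items(), source_b.items()):
        if normalize(name_a) == normalize(name_b):
            triples.append(f"<{iri_a}> <{OWL_SAME_AS}> <{iri_b}> .")
    return triples


# Hypothetical example data: IRI -> name.
customers = {"http://example.org/customer/17": "Ada  Lovelace"}
employees = {
    "http://example.org/employee/4": "ada lovelace",
    "http://example.org/employee/9": "Alan Turing",
}

for line in same_as_triples(customers, employees):
    print(line)
```

The pairwise comparison is quadratic in the number of records, which is acceptable for a sketch but is one of the reasons dedicated IR technology is worth evaluating.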
This thesis is about
- evaluating existing IR technology for establishing identification datasets,
- enhancing existing query answering technology to make use of identity information.
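As an illustration of what making use of identity information could mean for query answering, the following sketch groups identifiers into equivalence classes with a simple union-find structure and answers a triple-pattern lookup over the whole class of a subject. The class `IdentityIndex`, the function `objects_of`, and the `ex:` identifiers are all hypothetical; this is one possible approach under those assumptions, not a description of how any existing query engine works.

```python
class IdentityIndex:
    """Union-find over identifiers connected by owl:sameAs statements."""

    def __init__(self):
        self._parent = {}

    def find(self, iri):
        """Return the representative of the equivalence class containing iri."""
        self._parent.setdefault(iri, iri)
        while self._parent[iri] != iri:
            iri = self._parent[iri]
        return iri

    def add_same_as(self, a, b):
        """Record that a and b denote the same entity."""
        self._parent[self.find(a)] = self.find(b)


def objects_of(triples, index, subject, predicate):
    """Answer (subject, predicate, ?o), treating sameAs-equivalent subjects as one."""
    target = index.find(subject)
    return {o for s, p, o in triples if p == predicate and index.find(s) == target}


# Hypothetical facts spread over two sources.
triples = [
    ("ex:customer/17", "ex:email", "ada@example.org"),
    ("ex:employee/4", "ex:department", "Research"),
    ("ex:employee/9", "ex:department", "Cryptanalysis"),
]

index = IdentityIndex()
index.add_same_as("ex:customer/17", "ex:employee/4")

# The department is only recorded under the employee identifier,
# but the lookup by customer identifier still finds it.
print(objects_of(triples, index, "ex:customer/17", "ex:department"))
```

Canonicalizing identifiers at lookup time keeps the original data untouched; an alternative design would rewrite the stored triples once, trading flexibility for faster queries.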
Work may be carried out in connection with the SIRIUS centre (www.sirius-labs.no), giving opportunities to interact with various industry partners.