Using Machine Learning for Data Linking
Data integration often requires linking entities across the data sources that have been ingested, and from those sources to external ones. In public procurement, for example, matching the entities referred to in procurement data against the entities listed in business registries is a significant challenge. In this thesis, you will explore Machine Learning approaches, possibly combined with heuristic and rule-based approaches, for linking entities between data sources. The data will often be imperfect: information may be missing or incorrect, and formats and languages may differ between sources.
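
To make the task concrete, the following is a minimal sketch of a rule-based baseline of the kind a learned model could be compared against or combined with. It is not a prescribed method for the thesis. The field names (`reg_id`, `name`), the list of legal-form suffixes, and the 0.85 threshold are all assumptions chosen for illustration. The sketch normalises company names to reduce format and language mismatches, prefers an exact identifier match when an identifier is present, and otherwise falls back to string similarity.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical legal-form suffixes; a real list would be country-specific.
LEGAL_FORMS = {"ltd", "limited", "gmbh", "inc", "srl"}


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, drop legal-form tokens."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    tokens = [t for t in text.split() if t not in LEGAL_FORMS]
    return " ".join(tokens)


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalised names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link(record: dict, registry: list[dict], threshold: float = 0.85):
    """Return the best registry match for a record, or None."""
    # Rule: an exact match on a shared identifier wins outright.
    rec_id = record.get("reg_id")
    if rec_id:
        for cand in registry:
            if cand.get("reg_id") == rec_id:
                return cand
    # Fallback: the most similar name, if it clears the (assumed) threshold.
    best, best_score = None, 0.0
    for cand in registry:
        score = similarity(record["name"], cand["name"])
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= threshold else None


# Toy registry, invented for illustration only.
registry = [
    {"reg_id": "123", "name": "Müller Bau GmbH"},
    {"reg_id": "456", "name": "Acme Trading Ltd"},
]
match = link({"name": "Muller Bau", "reg_id": None}, registry)
```

In a Machine Learning setting, a score such as `similarity` would typically become one feature among several (address overlap, identifier agreement, and so on) fed to a trained classifier, rather than being compared against a fixed threshold.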