Vera Provatorova was a researcher-in-residence at the KB in 2023.
Her project 'Towards robust entity linking and disambiguation on Dutch historical documents' explored the question: to what extent is entity linking (EL) on Dutch-language archival data affected by entity overshadowing, and how can we make EL systems robust against it?
Project
Entity linking can enrich a dataset by connecting it to entities in a structured knowledge base. The process consists of identifying named entities (NEs) in a text, such as the names of people, places, and organizations, disambiguating them, and connecting each one to an entry in a knowledge base such as Wikidata. Dutch historical texts are especially challenging for automatic entity linking because of their historical language and the often imperfect OCR quality.
Vera Provatorova set out to tackle this challenge by linking entities in DBNL texts to Wikidata, as described in her blog post.
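The three steps of entity linking described above (identify, disambiguate, connect) can be illustrated with a minimal Python sketch. It is a toy illustration, not the method used in the project: the knowledge base, the identifiers `toy:1` and `toy:2`, the popularity scores, and all function names are hypothetical, and a simple label lookup stands in for a real named entity recognition model. The sketch also shows one way entity overshadowing is commonly understood: when two entities share a name, a linker that relies on popularity alone picks the better-known one, even if the context points to the other.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    entity_id: str        # hypothetical identifier, NOT a real Wikidata QID
    label: str            # surface form used to mention the entity
    keywords: frozenset   # words that typically appear near this entity
    popularity: int       # made-up prior: how often the entity is mentioned


# Toy knowledge base: two different entities share the name "Willem".
KNOWLEDGE_BASE = [
    Entity("toy:1", "Willem", frozenset({"prins", "oranje", "leger"}), 100),
    Entity("toy:2", "Willem", frozenset({"dichter", "gedicht", "bundel"}), 3),
]


def tokenize(text):
    """Split a text into word tokens."""
    return re.findall(r"\w+", text)


def find_mentions(text):
    """Step 1: identify mentions (label lookup as a stand-in for NER)."""
    labels = {entity.label for entity in KNOWLEDGE_BASE}
    return [token for token in tokenize(text) if token in labels]


def candidates(mention):
    """Collect every knowledge base entity that shares the mention's name."""
    return [entity for entity in KNOWLEDGE_BASE if entity.label == mention]


def link_by_popularity(mention):
    """Steps 2-3 using only the prior: the popular entity always wins."""
    return max(candidates(mention), key=lambda entity: entity.popularity)


def link_by_context(mention, text):
    """Steps 2-3 using context: prefer keyword overlap, then popularity."""
    words = {token.lower() for token in tokenize(text)}
    return max(
        candidates(mention),
        key=lambda entity: (len(entity.keywords & words), entity.popularity),
    )


text = "Willem schreef een gedicht voor de nieuwe bundel."
mentions = find_mentions(text)
prior_choice = link_by_popularity(mentions[0])
context_choice = link_by_context(mentions[0], text)
print(mentions, prior_choice.entity_id, context_choice.entity_id)
```

In this example the popularity-only linker chooses the more frequently mentioned entity, while the context-aware linker chooses the poet, because "gedicht" and "bundel" appear in the sentence. Real systems replace each step with trained models, and noisy OCR makes both the lookup and the context matching harder than this sketch suggests.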