Semantics 2017: Improving Access to Digital Content by Semantic Enrichment

Juliette Lonij will present her work with Theo van Veen on linking named entities in the KB's digitised newspaper collection at the Semantics conference on 12 September 2017.

Abstract

The collection of digitized historical newspapers of the National Library of the Netherlands contains an abundance of information about events, persons, concepts, and more. As part of our effort to extract this information automatically from the unstructured text, we are developing methods to recognize named entities and link them to external knowledge bases such as DBpedia and Wikidata.

We are continuously working to increase the accuracy of the links by exploring new machine learning algorithms and adding new features. Our current focus is on identifying and incorporating features that people may rely on when they link entities themselves, and on applying word and entity embeddings, as we expect these representations to be a valuable addition to our existing, mostly handcrafted features.
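
To illustrate the general idea of combining a handcrafted feature with an embedding-based one, here is a minimal sketch. It is not the method used for the KB collection: the candidate identifiers, the three-dimensional vectors, the `weight` parameter, and the function names are all hypothetical, and a real system would learn the feature weights with a trained model rather than fix them by hand.

```python
import math
from difflib import SequenceMatcher


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def string_similarity(mention, label):
    """Handcrafted feature: surface similarity of mention and candidate label."""
    return SequenceMatcher(None, mention.lower(), label.lower()).ratio()


def rank_candidates(mention, context_vec, candidates, weight=0.5):
    """Score each candidate as a weighted sum of both features, best first.

    `weight` (an assumption of this sketch) is the share given to the
    string feature; the remainder goes to the embedding feature.
    """
    scored = []
    for cand in candidates:
        score = (weight * string_similarity(mention, cand["label"])
                 + (1 - weight) * cosine(context_vec, cand["vector"]))
        scored.append((score, cand["id"]))
    return sorted(scored, reverse=True)


# Toy data: made-up identifiers and made-up entity embeddings.
candidates = [
    {"id": "candidate:city", "label": "Amsterdam", "vector": [0.9, 0.1, 0.0]},
    {"id": "candidate:ship", "label": "Amsterdam (ship)", "vector": [0.1, 0.9, 0.2]},
]

# Made-up embedding of the text surrounding the mention (a shipping report).
context_vec = [0.2, 0.8, 0.1]

ranking = rank_candidates("Amsterdam", context_vec, candidates)
best_id = ranking[0][1]
```

In this toy case the string feature alone prefers the exact label match, while the context embedding shifts the ranking towards the candidate whose vector is closer to the surrounding text. That is the kind of complementary signal embeddings are expected to add to handcrafted features.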