Are you an academic researcher interested in spending a week researching and coding on our challenge of automatically creating metadata? Join us at the ICT with Industry workshop at the Lorentz Center in Leiden, January 21-25, 2019. The problem, data, food, and accommodation are provided free of charge to selected participants.
The ICT with Industry workshop brings together scientists and professionals from industry and government. The workshop revolves around five exciting case studies, each of which is the subject of an intense week of analyzing, discussing, and modeling solutions.
You will be part of a team of 5 to 8 dedicated researchers from various ICT disciplines, actively collaborating on one of the cases during the week.
The case study of the National Library of the Netherlands (KB) focuses on '(Semi-)Automatic Cataloguing of Textual Cultural Heritage Objects'. We have been digitizing our collections at a rapid pace for a number of years now. Large amounts of scans and machine-readable text created from, for example, historical newspapers, periodicals, and books are made available to end users through portals such as Delpher. At the same time, the amount of content deposited by publishers or harvested from the web in digital form, such as e-books, e-journals, and web pages, is growing quickly as well.
Rich and accurate descriptive metadata, ranging from title and author on the one hand to specialist scientific subject headings on the other, is an essential prerequisite for enabling users to navigate these collections effectively. The current practice of creating such metadata manually, however, has become prohibitively time-consuming and, in some cases, prone to error. We therefore invite researchers to explore possibilities for automatically extracting relevant metadata from the objects in our digitized and born-digital collections, using methods and techniques from the field of Artificial Intelligence, in particular the subfields of Machine Learning and Natural Language Processing.