Datasets

Accessible e-books and audiobooks

There is a growing interest in making e-books and digital content services accessible. We asked specialists

Ot & Sien dataset

Introduction: The goal of this dataset is to provide training and evaluation data for the development

Is your OCR good enough?

Description: This webpage contains information about the datasets used and code developed as part of the

Assisted keyword assignment using Annif

Annif can be used to make cataloging more efficient by suggesting authors and keywords.

RDA Entity Finder

The RDA Entity Finder enables you to browse through the bibliographic Work, Expression, Manifestation and Item

Entangled Histories: Ordinances of the Low Countries

This special collection, Entangled Histories: Ordinances of the Low Countries, is made up of 108 books

CanOT

CanOT shows the canonicity curves of publications related to 21 selected authors or texts/figures.

KB Lab Bot

A Facebook Messenger Bot to retrieve cultural heritage masterpieces & code to build your own chatbot.

xportal

A simple search interface on Delpher data used for testing and demonstration.

Frame generator

Tool for extracting topics, keywords and their co-occurrence patterns from a Dutch corpus.

Ground-truth IMPACT project

Collection of 99.95% correct OCR of books, newspapers, parliamentary papers and radio bulletins meant for training

Example set

This collection consists of a small selection of our digitised publications from the years 1870-1871.

Python API

Simple API to access KB collections using Python.

Keyword generator

A command-line tool to extract significant keywords from a collection of sample texts.

ALTO Edit

ALTO Edit is a simple browser-based post correction tool for ALTO XML files.
