KB Lab
Alto-44
Historical newspapers OCR ground-truth
A dataset consisting of 2000 pages of historical newspaper ground truth, OCR, and images.
SIAMESET
The SIAMESET dataset consists of images and metadata of advertisements from two Dutch newspapers.
Dictionary viewer
The Dictionary viewer visualises the appearance of a word list in the newspaper corpus over time.
KBK-1M
The KBK-1M dataset is a collection of 1,603,396 images and accompanying captions from 1922 to 1994.
Keyword generator
A command-line tool to extract significant keywords from a collection of sample texts.
jpylyzer
Jpylyzer is a validator and feature extractor for JP2 (JPEG 2000 Part 1) images.