Tools

Ot & Sien dataset

The goal of this dataset is to provide training and evaluation data for the development ...

Historical newspapers OCR ground-truth

A dataset consisting of 2,000 pages of historical newspaper ground truth, OCR, and images.

CHRONIC

The CHRONIC dataset consists of metadata for 313K newspaper images classified using computer vision techniques.

SIAMESET

The SIAMESET dataset consists of images and metadata of advertisements from two Dutch national newspapers.

KBK-1M

The KBK-1M dataset is a collection of 1,603,396 images and accompanying captions from the period 1922–1994.
