Historical newspapers OCR ground-truth

A dataset consisting of 2,000 pages of historical newspaper ground truth, OCR output, and images.
An example of an image and caption extracted from the front page of the January 27th, 1951 issue of De Nieuwsgier.

KBK-1M

The KBK-1M dataset is a collection of 1,603,396 images and accompanying captions from 1922 to 1994.

In the KB Lab you can find experimental tools and data built for and from the digital collection of the KB, National Library of the Netherlands.
