Datasets & Tools

Newspaper ngram collection

This dataset was generated by PoliticalMashup and contains yearly counts for word ngrams for n ranging

xportal

A simple search interface on Delpher data used for testing and demonstration.

Frame generator

Tool for extracting topics, keywords and their co-occurrence patterns from a Dutch corpus.

Genre classifier

The Genre classifier predicts the genre of a Dutch newspaper article, using plain text as input.

Dictionary viewer

The Dictionary viewer visualises the appearance of a word list in the newspaper corpus over time.

KBK-1M

The KBK-1M dataset is a collection of 1,603,396 images and accompanying captions from the period 1922–1994

Europeana Newspapers NER

Dataset for evaluation and training of NER software for historical newspapers in Dutch, French, Austrian

Ground-truth IMPACT project

Collection of 99.95% correct OCR of books, newspapers, parliamentary papers and radio bulletins meant for training

Example set

This collection consists of a small selection of our digitised publications from the years 1870-1871.

Python API

Simple API to access KB collections using Python.

Keyword generator

A command-line tool to extract significant keywords from a collection of sample texts.

ALTO Edit

ALTO Edit is a simple browser-based post correction tool for ALTO XML files.

KB Newspapers image count

Graphical overview of images in KB Newspapers.

PoliMedia

PoliMedia allows cross-media analysis of coverage of parliamentary debates in a uniform search interface.

Newspaper ngram viewer

The PoliticalMashup ngram viewer visualises the frequency of a certain phrase in the Delpher newspaper collection.
