Thomas Smits was a researcher-in-residence at the KB in 2017.
His project ‘Illustrations to Photographs: Using Computer Vision to Analyse News Pictures in Dutch Newspapers, 1860-1940’ explored the question: How can we sort images from digitised Dutch newspapers by type of image?
Project
Thomas Smits studied news illustrations using computer vision and image processing techniques. He collected a dataset and used computer vision to classify the images by the technique used to reproduce them (engraving or half-tone), and then into more specific subcategories.
In his first blog post he discusses digital humanities, the use of n-grams and distant reading to research digital corpora, and the importance of combining quantitative and qualitative research. The second blog post goes deeper into the project itself, introducing the domain and the techniques used.
The work resulted in the CHRONIC (Classified Historical Newspaper Images) dataset and the CHRONReader tool, which lets users search images by visual characteristics, time period, category, and keywords.