Dr Annemieke Romein was a researcher-in-residence at the KB in 2019.
Her project ‘Entangled Histories of Early Modern Ordinances’ explored the question: How do we improve the OCR quality of early modern legal texts and can we automatically add metadata to these texts?
Project
Annemieke Romein explored the domain of ordinance books: collections of historical norms and administrative laws. Her first blogpost introduces this topic and discusses the challenges of digitizing these ordinances using Handwritten Text Recognition (HTR). The second blogpost covers the collection of the corpus and the application of HTR techniques, as well as an initial analysis of the corpus. Two bonus blogposts dive deeper into the technical side of the project, discussing Automatic Text Recognition (ATR) and segmentation, respectively.
The project resulted in the Entangled Histories dataset, containing 108 Dutch books of ordinances.