As Researcher-in-Residence at the KB National Library of the Netherlands (The Hague), I, Annemieke Romein, look for early modern norms in books of ordinances (‘plakkaatboeken’). These books hold mainly administrative laws, ranging from norms against beggars and rules on safety to laws intended to stimulate the economy; international treaties are frequently included too. Together with colleagues at the library, I am looking for ways to have the computer take over the process of categorising these texts. The question is: can a computer recognise the texts containing rules, and distinguish between texts on religion, maintenance of dykes, colonial affairs, etc.?
Currently, I work as a postdoctoral researcher at Ghent University, where I focus on both handwritten and printed norms from the Low Countries. My current project (called ‘Law and Order: Low Countries?!’) focuses on the federation-states of Flanders and Holland between 1576 (Pacification of Ghent) and 1702 (War of the Spanish Succession). Initially, I am looking at the topics of the published norms. Personally, I find the rules on safety and security most intriguing, as the topics are still relevant today. However, even when you are interested in only a selection of the texts, it is necessary to go through all of them. This gave rise to my initial question of whether the computer can be trained to distinguish texts from one another. I am very grateful for the chance to work at the Dutch National Library on my project ‘Entangled Histories’ for six months.
My interest in the topic of legal history – more specifically Policeygesetzgebung and Policeywissenschaften (police norms) – was piqued in 2005 when I was looking for a Bachelor thesis topic. Professor Robert von Friedeburg had me read some German articles (Karl Härter / Michael Stolleis, Policey im Europa der Frühen Neuzeit (Frankfurt 1996)) which gave, at that time, the impression that norms required a strong authoritarian type of rule: a top-down approach that one could only find in principalities or city-states. I was asked why similar texts were available in books of ordinances in the Netherlands, and this question has gradually evolved into my current projects.
In the early modern period, rules were announced by the city crier. He walked through the city, or rode a horse, to visit designated locations and read new rules to the inhabitants. After reading them aloud, he fixed the printed texts to “well-known places” (e.g. church doors, the market square) so that people could reread them. An estimated 70% of people in the Republic could read, so for the remainder of the inhabitants having the new rules read aloud was still very important. The rules had to make sense so people could remember them by heart. Hence, there is a repetitiveness in the texts, which makes sense given that the 16th and 17th centuries had an important oral tradition.
Ordinances, or placards, were affixed to known places. This made them official. The provincial estates considered it important to print a selection of their agreed-upon texts in books of ordinances. These books served lawyers as a reference work. They do not form a complete overview, but they do give a sense of what government officials deemed important.
The Dutch Republic and the Habsburg Netherlands were both federations of autonomous states. In the Republic, the Estates-General held sovereign powers and in each of the federation-states the estates held the highest power. The Republic’s Stadtholders were officially civil servants. The Habsburg Netherlands differed from the Republic in that they had a sovereign prince (the King of Spain), though the federation-states did have a certain amount of freedom. They had to verify that new rules did not jeopardise traditions and customs. Yet, when you look in many history books, the scene is depicted as a game of thrones. As a result, the political-institutional constellations of republics (the Republic and Switzerland alike) are poorly studied. I find this very intriguing: we actually know too little to say something concrete about how federation-states without a prince were ruled! While Belgium has a long tradition of republishing the rules through the Royal Commission for the Publication of Ancient Laws and Ordinances (since 1846), the Netherlands does not have such an institute. Such publications offer a rich source of information.
In the Netherlands, we do not have an overview of what laws were published and, hence, we cannot say anything substantial about the legislation in the Low Countries, about differences among the federation-states, or even about how they compare with other areas in Europe. This is fascinating because my current research in Ghent gives rise to the suspicion that the differences between Holland and Flanders, both very trade-oriented, were not that big. Hence, my current hypothesis is that there was little difference between an indirectly ruled princely state and the federation-state Holland (as part of the Dutch Republic).
In order to study the potential differences, the first step in the project is to analyse the layout of the pages and enhance the readability of the early modern texts. As humans, we can read the texts fairly well, but gothic fonts prove to be more challenging than roman fonts. The computer does not mind; rather, it considers it a challenge. Within the Google Books project, OCR (Optical Character Recognition) has been applied, but when you attempt to copy the texts (to a notepad, for example), the result is useless due to the numerous errors.
These errors directly influence the searchability of the texts, even if you are only looking for keywords. In order to improve this, we apply Handwritten Text Recognition. This technique was developed within the European-funded READ project, which resulted in the application called Transkribus. The computer program is trained through manual transcriptions of a single hand or, in our case, a font. Through this training, the computer learns how certain characters are represented. Furthermore, the program places the characters within a context by looking at complete lines instead of individual characters. It does so with n-grams, which record the most frequent character combinations (for example, a ‘q’ is almost always followed by a ‘u’).
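To make the idea of frequent character combinations concrete, here is a minimal sketch of a character bigram count (an n-gram with n = 2). It is only an illustration and not how Transkribus works internally; the function names and the three sample lines are invented for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(lines):
    """Count, for every character, which characters follow it."""
    followers = defaultdict(Counter)
    for line in lines:
        text = line.lower()
        # Pair each character with the one directly after it.
        for current, nxt in zip(text, text[1:]):
            followers[current][nxt] += 1
    return followers


def most_likely_next(followers, char):
    """Return the most frequent follower of `char`, or None if unseen."""
    counts = followers.get(char)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Toy training lines, invented for illustration (not from a real placard).
sample_lines = [
    "quaede practijcquen",
    "de quartieren van hollandt",
    "eenige questien",
]

model = train_bigrams(sample_lines)
print(most_likely_next(model, "q"))  # prints: u
```

With counts like these, a recogniser that is unsure whether the character after a ‘q’ is a ‘u’ or an ‘n’ can prefer the combination it has seen most often.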
We basically fool the computer by claiming that our printed sources are very regular handwriting, which helps enormously with recognition. I had heard about these techniques before, but I had never applied them to printed texts. I became curious about whether they would be as successful as claimed. Would this work on the 108 books of ordinances that we retrieved from various digital collections? So far the results are giving us hope, and a reason to believe we can apply other techniques to analyse individual texts. While I am currently still hand-labelling the individual texts in Ghent, the question arises whether the computer can do this too. I use the same categories as the Repertorium der Policeyordnungen project once did, so that the datasets can be merged at some point through Linked Data analysis. Can the High-Performance Computer (Artificial Intelligence) with the program Annif read through the texts and spot the words that triggered me to categorise them in a certain way? And can it then apply this to other books of ordinances? While working at the KB, I hope to find the answer to this question.
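The idea of spotting trigger words can be sketched in a few lines. This is a deliberately simple stand-in and not Annif's interface: the trigger lists, the English example sentence and the function name are all hypothetical, and a trained tool would presumably learn such associations from the hand-labelled texts rather than from a hand-written list.

```python
import re

# Hypothetical trigger words per category (English stand-ins, illustration only).
TRIGGERS = {
    "religion": {"church", "sabbath", "sermon"},
    "dykes": {"dyke", "dykes", "sluice", "flood"},
    "colonial affairs": {"company", "colony", "indies"},
}


def categorise(text):
    """Return the category whose trigger words occur most often, or None."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = {
        category: sum(1 for word in words if word in triggers)
        for category, triggers in TRIGGERS.items()
    }
    best = max(scores, key=scores.get)
    # If no trigger word occurs at all, do not guess a category.
    return best if scores[best] > 0 else None


example = "All owners shall repair the dyke and the sluice before winter."
print(categorise(example))  # prints: dykes
```

A text without any trigger word returns None, which mirrors the cases that would still need a human to label them.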
This blog was previously published – in an altered version – in Dutch at https://www.kb.nl/blogs/digitale-geesteswetenschappen/plakkaten-classifi... and is a reblog from http://esclh.blogspot.com/2019/08/project-presentation-digital-humanitie...