The Scansion generator is a tool developed within the KB Fellowship of Professor Marc van Oostendorp. As the first fellow in Digital Humanities, Van Oostendorp surveyed the historical development of rhythm in Dutch texts by means of this tool.
The tool marks the metrical scansion of modern Dutch poetry. It depends on the Celex database. The web demo assumes that the submitted poem is written in iambic pentameter; the command-line version allows this assumption to be overridden with another meter.
The tool and technical instructions are available on GitHub.
When using the Scansion Generator, please cite it as follows:
Koppelaar, H., Oostendorp, M. van, The Scansion generator (2013), KB Lab: The Hague http://lab.kb.nl/tool/scansion-generator
Simply paste in the Dutch poem you wish to process and click 'Submit'. The meter is represented as zeroes and ones. A 0 means unstressed, while a 1 means stressed.
Internally, the meter is coded as a string of ones and zeroes. For any given line of verse, the relative weight of the syllables within each individual word is determined first. There are four weight classes for syllables: heavy, light, unstressed and unknown. This stress information is derived from the Celex data in combination with the following two heuristics, each of which is applied only if the resulting meter lies closer (in the Levenshtein sense) to the ideal metrical pattern:
- syllables that are left out by means of an apostrophe do not count;
- if a word ends in a vowel and the following word starts with a consonant, the two adjacent syllables are treated as a single syllable, which takes the stress of the heavier of the two.
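The "apply only if closer" condition can be illustrated with a minimal Python sketch. It is not the tool's own code: the function names `levenshtein`, `merge_syllables` and `apply_if_closer` are hypothetical, and for simplicity the sketch works on patterns that are already reduced to 0 and 1 rather than on the four weight classes.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def merge_syllables(pattern: str, i: int) -> str:
    """Merge syllables i and i+1 into one, keeping the stronger stress."""
    return pattern[:i] + max(pattern[i], pattern[i + 1]) + pattern[i + 2:]


def apply_if_closer(pattern: str, adjusted: str, target: str) -> str:
    """Keep the adjusted pattern only if it is strictly closer to the target."""
    if levenshtein(adjusted, target) < levenshtein(pattern, target):
        return adjusted
    return pattern


target = "0101010101"             # iambic pentameter
line = "01001010101"              # eleven syllables, one too many
merged = merge_syllables(line, 2)
print(apply_if_closer(line, merged, target))  # 0101010101
```

In this example the merge is accepted because it lowers the distance to the target from 1 to 0; a merge that moved the line further from the target would be discarded.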
This per-word stress information does not yet give us the meter, among other reasons because many words consist of a single syllable. To determine the meter for as many syllables as possible, the following rules are applied:
- if a syllable is more stressed than one of its neighbours and at least as stressed as the other neighbour, then it is upgraded (coded as 1);
- if a syllable is less stressed than one of its neighbours and no more stressed than the other neighbour, then it is downgraded (coded as 0).
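The neighbour comparison can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's implementation: the numeric encoding of the weight classes, the use of `None` for "unknown", the `?` marker for undecided syllables and the function name `resolve_by_neighbours` are all hypothetical. The downgrade rule is read here as the mirror image of the upgrade rule (less stressed than one neighbour and no more stressed than the other), and syllables at the edge of the line or next to an unknown syllable are simply left undecided.

```python
from typing import List, Optional

# Hypothetical numeric encoding of the weight classes; None means "unknown".
HEAVY, LIGHT, UNSTRESSED = 3, 2, 1


def resolve_by_neighbours(weights: List[Optional[int]]) -> str:
    """Code each syllable as 1, 0 or ? by comparing it with both neighbours."""
    result = []
    for i, w in enumerate(weights):
        left = weights[i - 1] if i > 0 else None
        right = weights[i + 1] if i + 1 < len(weights) else None
        if w is None or left is None or right is None:
            result.append("?")   # assumption: not enough information to decide
        elif (w > left and w >= right) or (w > right and w >= left):
            result.append("1")   # upgraded
        elif (w < left and w <= right) or (w < right and w <= left):
            result.append("0")   # downgraded
        else:
            result.append("?")   # equal to both neighbours: still undecided
    return "".join(result)


print(resolve_by_neighbours([UNSTRESSED, HEAVY, UNSTRESSED, LIGHT, LIGHT]))  # ?101?
```

The syllables still marked `?` are the ones that remain open after this step.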
Even after this revision, the meter may not be fully determined. From the remaining possible meters, the one that lies closest (in the Levenshtein sense) to the target pattern is chosen.
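A minimal Python sketch of this final selection step is given below. It assumes, hypothetically, that undecided syllables are marked with `?` and that every such syllable may become either 0 or 1; the names `levenshtein` and `closest_meter` are illustrative and not taken from the tool. Ties are resolved in favour of the first candidate found, which is also an assumption.

```python
from itertools import product


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,
                           cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def closest_meter(partial: str, target: str) -> str:
    """Fill every '?' with 0 or 1 and return the candidate nearest to target."""
    slots = [i for i, c in enumerate(partial) if c == "?"]
    best_dist, best = None, partial
    for fill in product("01", repeat=len(slots)):
        chars = list(partial)
        for pos, value in zip(slots, fill):
            chars[pos] = value
        candidate = "".join(chars)
        dist = levenshtein(candidate, target)
        if best_dist is None or dist < best_dist:   # first minimum wins on ties
            best_dist, best = dist, candidate
    return best


print(closest_meter("?1?1010?01", "0101010101"))  # 0101010101
```

Enumerating all fillings grows exponentially with the number of undecided syllables, which stays manageable for a single line of verse.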
The tool was used in the research carried out by Dr. Marc van Oostendorp during his time as KB Fellow. He concluded the fellowship with a closing lecture.