Dutch Novels 1800-2000


    Textual features and metadata for DBNL novels 1800-2000

    This dataset contains a corpus of 1346 novels from DBNL. Included are metadata and textual features such as word counts and syntactic features. The metadata includes variables related to canonicity: public library information, secondary references, Wikipedia mentions, etc. 

    The dataset consists of two parts:

    1. Textual features and metadata (open access): https://zenodo.org/record/5786254
    2. Parsed texts (restricted access): https://zenodo.org/record/5887620

    The titles have been selected using the following criteria:

    • Novels and novellas 
    • Originally written in Dutch 
    • First published 1800-2000 
    • TEI edition of the title available on https://www.DBNL.org
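
    As a rough illustration, the selection criteria above can be read as a filter over candidate titles. The sketch below is not the procedure used to build the corpus: the Title record, its field names, and the example rows are hypothetical placeholders, and the year range is assumed to be inclusive.

```python
from dataclasses import dataclass


@dataclass
class Title:
    # Hypothetical record; these field names are illustrative only
    # and are not the column names of the actual dataset.
    name: str
    genre: str
    original_language: str
    first_published: int
    has_tei: bool


def is_selected(t: Title) -> bool:
    """Apply the four selection criteria (year bounds assumed inclusive)."""
    return (
        t.genre in {"novel", "novella"}
        and t.original_language == "nl"
        and 1800 <= t.first_published <= 2000
        and t.has_tei
    )


# Placeholder rows, made up to exercise each criterion once.
candidates = [
    Title("Example A", "novel", "nl", 1860, True),
    Title("Example B", "novel", "fr", 1900, True),     # not originally Dutch
    Title("Example C", "poetry", "nl", 1950, True),    # not a novel or novella
    Title("Example D", "novella", "nl", 2005, True),   # published after 2000
    Title("Example E", "novella", "nl", 1999, False),  # no TEI edition
]

selected = [t.name for t in candidates if is_selected(t)]
print(selected)
```

    Only the first placeholder row satisfies all four criteria; each of the others fails exactly one.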

    A searchable version of the list of novels and metadata is available. 

    Acknowledgements: Information from public libraries was contributed by Trudie Stoutjesdijk and Eddie de Kok from Data Warehouse. 
    When using this dataset we ask you to cite it as follows: 

    Andreas van Cranenburgh, Sara Veldhoen, Michel De Gruijter (2022). Textual features and metadata for DBNL novels 1800-2000 [Data set]. Zenodo. https://doi.org/10.5281/zenodo.5786254.