Introduction
With new legislation (the European Accessibility Act) on the horizon, there is growing interest in making e-books and digital content services accessible. Publishers focus mainly on how to make new e-books accessible, and little is known about how to ‘remediate’ existing digital content. Libraries hold many public domain publications: what does it take to make some of those accessible too? What are the advantages for the intended users? What tools or workflows exist to convert them efficiently? What are realistic cost estimates?
To answer these questions, we asked specialists to convert several public domain (and CC0) publications into accessible versions. All these versions are made available here for comparison and testing.
Our first goal is to get a better understanding of how these accessible versions help real people. We welcome feedback and are open to discussing further improvements. Alongside the original ePub2 and PDF files, we provide accessible ePub3 and PDF/UA (Universal Accessibility) versions. We will also work on improving the accessibility of audiobooks and intend to make ePubs with audio overlay available in the near future.
Apart from user testing, these files can also help with evaluating ‘reading’ systems (e-reading apps and hardware). Because the document standards leave some room for interpretation, and software developers often do not implement all features, not everything always works as expected. With these test files, we hope to cover a wide range of document types and features.
Any reuse is permitted: all accessible-format documents are made from CC0-licensed or public domain sources, and the accessible-format documents themselves are CC0 licensed, so everyone may reuse them without restrictions.
A very convenient way to download and read these titles is with Thorium Reader. Thorium Reader is free and open source, available on Windows, macOS and Linux, and developed by the European Digital Reading Lab (of which the KB is a member). Instead of downloading titles one by one and importing them into this reader, it is also possible to add the whole catalog as an OPDS feed. Use the following link for this:
https://kbresearch.nl/epub2opds/opds.atom
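The same feed can also be read outside Thorium Reader. The following is a minimal sketch in Python, assuming the feed follows the usual OPDS 1.x conventions: an Atom document in which each title is an `entry` and downloadable files are `link` elements whose `rel` contains "acquisition". The function names `list_titles` and `fetch_feed` are hypothetical helpers made up for this illustration, and the actual structure of the feed has not been verified here.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Atom namespace used by OPDS 1.x catalogs.
ATOM = "{http://www.w3.org/2005/Atom}"
# Feed address as given above.
FEED_URL = "https://kbresearch.nl/epub2opds/opds.atom"


def list_titles(feed_xml):
    """Return a list of (title, [(media_type, href), ...]) per Atom entry.

    Assumption: download links carry a rel value containing "acquisition",
    as is conventional in OPDS 1.x feeds.
    """
    root = ET.fromstring(feed_xml)
    titles = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="(untitled)")
        links = [
            (link.get("type", ""), link.get("href", ""))
            for link in entry.findall(ATOM + "link")
            if "acquisition" in link.get("rel", "")
        ]
        titles.append((title, links))
    return titles


def fetch_feed(url=FEED_URL):
    """Download the raw feed (hypothetical helper; requires network access)."""
    with urlopen(url, timeout=30) as response:
        return response.read()
```

Calling `list_titles(fetch_feed())` would then give one item per title with its download links. Links in the feed may be relative, in which case they would need to be resolved against the feed address (for example with `urllib.parse.urljoin`) before downloading.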
The individual books can also be downloaded from the examples page.
When using this dataset we ask you to cite it as follows:
T. van der Togt, Accessible e-books and audiobooks (version 1, 14-10-2021), KB Lab, The Hague.
Examples
The first dataset contains 17 e-books and audiobooks; more titles and additional (audio) formats will be added in the near future.
- Alice in Wonderland 1: the ‘original’ ePub from Wikisource (ePub2) PD Mark
- Alice in Wonderland 2: an accessible version of 1 (reflowable ePub 3.2) CC0
- Alice in Wonderland 3: an alternative accessible version of 1 (reflowable ePub 3.2) CC0
- Alice in Wonderland 4: an accessible version of 1, with a fixed layout modelled on a copy at the Internet Archive (fixed layout ePub 3.2) CC0
- Alice in Wonderland Audio: the original audiobook version from LibriVox, packaged as W3C Audiobook. PD Mark
- Open a GLAM Lab 1: the original open access publication from Glamlabs.io (ePub2) CC0
- Open a GLAM Lab 2: an accessible version of 1 (ePub 3.2) CC0
- Open a GLAM Lab PDF: the original open access publication from Glamlabs.io (PDF) CC0
- Open a GLAM Lab PDF/UA: an accessible version of the PDF (PDF/UA) CC0
- Open a GLAM Lab Audio: an audio version narrated from the original, packaged as W3C Audiobook. CC0
- De Lotgevallen van Ferdinand Huyck 1: the ‘original’ copy from the eBook eregalerij (ePub2) PD Mark
- De Lotgevallen van Ferdinand Huyck 2: an accessible version of 1 (ePub 3.2) CC0
- Max Havelaar 1: the ‘original’ copy from the eBook eregalerij (ePub2) PD Mark
- Max Havelaar 2: an accessible version of 1 (ePub 3.2) CC0
- Max Havelaar Audio: the original audiobook version from LibriVox, packaged as W3C Audiobook. PD Mark
- Eva 1: the ‘original’ copy from the eBook eregalerij (ePub2) PD Mark
- EPub3 testbook: a title specifically made for testing many different book elements (ePub 3.2) CC0