
Tutorial

Automatically extract XML content with Python

A quick-start guide to working with XML files in Python. The course covers various XML formats.
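As a rough illustration of the kind of task such a course addresses, the sketch below pulls text out of a small XML document with Python's standard library. It is not taken from the tutorial itself: the sample document, its element names, and the helper `extract_text` are hypothetical. Real-world formats typically use XML namespaces, so their tag names would need to be qualified accordingly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are illustrative only.
SAMPLE_XML = """<article>
  <title>Example headline</title>
  <p>First <hi>paragraph</hi>.</p>
  <p>Second paragraph.</p>
</article>"""


def extract_text(xml_text, tag):
    """Return the text of every <tag> element, including nested inline text."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree; itertext() also collects text of child elements.
    return ["".join(el.itertext()).strip() for el in root.iter(tag)]


if __name__ == "__main__":
    print(extract_text(SAMPLE_XML, "title"))
    print(extract_text(SAMPLE_XML, "p"))
```

Joining `itertext()` rather than reading `el.text` keeps the words inside inline child elements, which `el.text` alone would drop.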
Clockwork picture of an itinerant dentist performing an extraction in French rural scene, wood frame, metal workings, first half 19th century. Science Museum, London. Attribution 4.0 International (CC BY 4.0) (cropped from original).

Extracting text from EPUB files in Python

Johan van der Knijff published a brief introduction to extracting unformatted text from EPUB files.
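For orientation, here is a minimal standard-library sketch of the general idea, not the method described in that introduction. It relies on the fact that an EPUB file is a ZIP container holding (X)HTML content documents. The helper names `epub_to_text` and `_TextCollector` are hypothetical, and the sketch reads files in alphabetical order instead of following the reading order declared in the package document, which a more careful implementation would do.

```python
import zipfile
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect character data, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def epub_to_text(epub_file):
    """Return unformatted text from the (X)HTML files in an EPUB container.

    Assumption: files are read in sorted name order, not in spine order.
    """
    chunks = []
    with zipfile.ZipFile(epub_file) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = _TextCollector()
                parser.feed(archive.read(name).decode("utf-8", errors="replace"))
                parser.close()
                # Collapse runs of whitespace left over from the markup.
                text = " ".join("".join(parser.parts).split())
                if text:
                    chunks.append(text)
    return "\n".join(chunks)
```

`epub_to_text` accepts either a file path or a binary file-like object, since `zipfile.ZipFile` handles both.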


In the KB Lab you can find experimental tools and data built for and from the digital collection of the KB, National Library of the Netherlands.
