Document layout analysis resources for development with PdfPig.
Updated Oct 1, 2023 - C#
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
Named Entities Recognition Annotator Tool for Europeana Newspapers
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets
ALTO XML schema - latest and all former versions
Text Overlay plugin for Mirador 3
Kitodo.Presentation is a feature-rich framework for building a METS- or IIIF-based digital library. It is part of the Kitodo Digital Library Suite.
Data Mining Historical Newspaper Metadata (METS/ALTO formats)
QA-tool for scans with corresponding ALTO-files
Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis
A pipeline to transfer ground truth from Transkribus to eScriptorium.
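Most of the projects above read or write ALTO XML, so a rough sketch of what that involves may help. The snippet below is only an illustration, not code from any listed repository: the sample document, the `local_name` helper, and the `alto_lines` function are hypothetical. It assumes the usual ALTO layout, where `TextLine` elements contain `String` elements whose `CONTENT` attribute holds the recognized word. Tags are matched by local name so the same code can handle different ALTO schema versions.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal ALTO document, for illustration only.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout>
    <Page ID="p1">
      <PrintSpace>
        <TextBlock ID="b1">
          <TextLine ID="l1">
            <String CONTENT="Hello" HPOS="10" VPOS="10" WIDTH="50" HEIGHT="12"/>
            <SP/>
            <String CONTENT="world" HPOS="70" VPOS="10" WIDTH="50" HEIGHT="12"/>
          </TextLine>
          <TextLine ID="l2">
            <String CONTENT="second" HPOS="10" VPOS="30" WIDTH="60" HEIGHT="12"/>
            <SP/>
            <String CONTENT="line" HPOS="80" VPOS="30" WIDTH="40" HEIGHT="12"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text: str) -> list[str]:
    """Return one string per TextLine, joining its String CONTENT values with spaces."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # Only direct String children carry words; SP elements mark spaces.
        words = [
            child.attrib.get("CONTENT", "")
            for child in elem
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(words))
    return lines


if __name__ == "__main__":
    for line in alto_lines(SAMPLE_ALTO):
        print(line)
```

Matching on local names rather than a fixed namespace URI is a deliberate simplification; a stricter tool would validate the file against the specific ALTO schema version it declares.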