Model HTR Latin Sermons 11th–12th C.: Methodology, Transcription, Testing, and Evaluation

Authors

Michala Falátková, Barbora Kulhová

DOI:

https://doi.org/10.26806/hisape.n54.1

Keywords:

HTR, Latin homilies, medieval manuscripts, Transkribus, Digital Humanities

Abstract

This study presents the methodology used to create a Handwritten Text Recognition (HTR) model for Latin homilies from the 11th–12th centuries. The Latin Sermons 11th–12th C. model was trained on data from eight manuscripts held in libraries across five European countries, written in late Carolingian and early Gothic minuscule. The article focuses on preparing the training data, selecting transcription rules, and testing parameters such as data volume, number of epochs, and the use of a base model. The final model achieved a Character Error Rate (CER) of 5.7% and demonstrated high adaptability, opening new possibilities for its use within Digital Humanities.
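For readers unfamiliar with the metric, the following is a minimal sketch of how a Character Error Rate is commonly computed: the Levenshtein (edit) distance between the reference transcription and the model output, divided by the length of the reference. The abstract does not state the exact procedure used for the reported figure, so this is the standard definition rather than the authors' implementation. The function names and the sample line are hypothetical and not taken from the training data.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `ref` into `hyp`."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from ref
                curr[j - 1] + 1,     # insertion into ref
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(ref, hyp) / len(ref)


# Illustrative (hypothetical) ground-truth line and model output.
reference = "in principio erat uerbum"
hypothesis = "in principio erat uerbun"
print(f"CER = {cer(reference, hypothesis):.1%}")
```

Under this definition, a CER of 5.7% corresponds to roughly 57 character-level edits per 1,000 characters of reference text.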

Author Biographies

Michala Falátková, Catholic Theological Faculty, Charles University

Katolická teologická fakulta | Catholic Theological Faculty
Univerzita Karlova | Charles University

Mgr. Michala Falátková, Ph.D. (* 1988)

michaela.falatkova@ktf.cuni.cz

Barbora Kulhová, Catholic Theological Faculty, Charles University

Katolická teologická fakulta | Catholic Theological Faculty
Univerzita Karlova | Charles University

Bc. Barbora Kulhová (* 2002)

barbora.kulhova@ktf.cuni.cz

Published

2025-12-05

Section

Studies