This paper introduces an AI-assisted, human-centered, and minimalist software stack and data model for structuring and storing serial sources on early modern Catholic Church administration. The Vatican Archive preserves vast quantities of documents recording the Church's administrative history. To date, the sheer volume and technical character of these Latin manuscripts have made systematic study appear nearly impossible. The multinational project GRACEFUL17 uses AI to examine seventeenth-century Church governance on a large scale. It relies on simple but efficient NLP techniques (NER, span categorization, fuzzy search) and classification techniques (gradient boosting) that run fast, reliably, and reproducibly. These techniques support multi-user offline work environments as well as quick but controlled data modelling in a knowledge graph. By documenting this workflow, the paper enhances replicability and provides a rationale for specific design decisions that goes beyond technical documentation. This paper advocates the use of “weak AI” on several grounds. Functionally, non-LLM pipelines offer stricter controllability and avoid many of the semantic biases introduced by large language models. They also incur less training overhead and run easily on local machines. Methodologically, the combination of simple AI models and symbolic reasoning underscores the indispensable role of human expertise: only experts can provide the ground truth that models need to reproduce and formalize complex semantic concepts and phenomena, so this interpretive work is not outsourced to foundation models.
Christoph Sander (Fri,) studied this question.