Historical letters in archives are typically stored in fonds organized by letter recipient. To analyze the correspondence of a person x, one therefore has to aggregate the letters received in x’s own fonds with the letters in the fonds of the correspondents y_i who received letters from x. To address this challenge, epistolary data services have been created that aggregate data from distributed, heterogeneous archival data silos and fonds. This paper presents an overview of a new in-use data service and portal for this task, LetterSampo Finland – Finnish Nineteenth-Century Letters on the Semantic Web. In contrast to various legacy online services, the system is based on a Linked Open Data (LOD) Knowledge Graph (KG), available in a SPARQL endpoint, that has been aggregated and harmonized from distributed heterogeneous data sources, in our case 16 Finnish cultural heritage organizations and over 1,600 fonds. The KG contains metadata about nearly 1.3 million letters sent or received in the Grand Duchy of Finland during 1809–1917, as well as letter contents from four critical editions of the correspondence of prominent Finns. The paper shows how the KG and a portal built on top of it can be used for searching and browsing letter data and for data analysis in digital humanities research. We show how the aggregated datasets relate to and enrich each other, pinpointing the semantic challenges of the data aggregation and linking processes involved. This kind of analysis is needed to make enriched LOD more transparent to the end user and to enhance the data literacy required for reliable computational analyses.
Hyvönen et al. studied this question.
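The aggregation problem stated above, where the letters a person received sit in that person's own fonds while the letters they sent are scattered across the fonds of their recipients, can be illustrated with a minimal sketch. The `Letter` record, the toy `fonds` dictionary, and the function `correspondence_of` are hypothetical names introduced only for illustration; they do not reflect the actual LetterSampo data model or its SPARQL endpoint.

```python
from typing import NamedTuple


class Letter(NamedTuple):
    """Hypothetical minimal letter metadata record."""
    sender: str
    recipient: str
    year: int


# Toy data (invented): each fonds is keyed by the recipient who kept the letters.
fonds = {
    "x": [Letter("y1", "x", 1850), Letter("y2", "x", 1852)],
    "y1": [Letter("x", "y1", 1851), Letter("z", "y1", 1853)],
    "y2": [Letter("x", "y2", 1853)],
}


def correspondence_of(person, fonds):
    """Collect letters received by `person` (own fonds) and letters
    sent by `person` (found in the fonds of other recipients)."""
    received = list(fonds.get(person, []))
    sent = [
        letter
        for owner, letters in fonds.items()
        if owner != person
        for letter in letters
        if letter.sender == person
    ]
    # Merge both directions into one chronological list.
    return sorted(received + sent, key=lambda letter: letter.year)


letters_of_x = correspondence_of("x", fonds)
for letter in letters_of_x:
    print(letter.year, letter.sender, "->", letter.recipient)
```

In this toy setting the letter from z to y1 is excluded, because x is neither its sender nor its recipient. A real aggregation would additionally have to reconcile differing name forms and identifiers for the same person across archives, which this sketch ignores.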