Abstract

Traditionally, textual witnesses are compared through manual collation. This study introduces a computational approach that adapts two methods from bioinformatics, pairwise sequence alignment and dimensionality reduction, to measure and visualize textual relationships across a corpus. We apply global (Needleman–Wunsch) and local (Smith–Waterman) alignment algorithms directly to character strings, generating quantitative similarity scores that are then represented through t-distributed stochastic neighbour embedding. We also test a language-specific modification of the Needleman–Wunsch algorithm on two manuscripts in our corpus. Unlike automated collation methods that aim for semantic accuracy, this approach focuses on corpus-wide similarity patterns. The test corpus contains twenty-four medieval manuscripts of the Liturgical Targum, preserved in Jewish festival prayer books. Previous manual philological analysis had already identified two textual families among the Targum units within these prayer books. Our computational method independently replicates these families and reflects the overall coherence of the corpus. Crucially, it yielded two insights that the manual study had overlooked: the identification of a previously unrecognized textual subgroup and the discovery that two manuscripts were written by the same scribe. Local alignment proves effective for identifying the closest textual parallels of a fragmentary manuscript. The test of the language-specific modification on two manuscripts indicates improved alignment performance for Hebrew script. This article demonstrates that combining pairwise sequence alignment with dimensionality reduction is a powerful exploratory tool for engaging with a text corpus. The method requires only accurate transcriptions to produce maps of textual relationships that can guide subsequent detailed collation and interpretation.
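To make the pipeline described above more concrete, the following is a minimal Python sketch of character-level global and local alignment scoring, followed by a pairwise similarity matrix. It is an illustration under stated assumptions, not the implementation used in this study: the scoring values (match +1, mismatch -1, gap -1), the normalization by the length of the longer string, and all function names are hypothetical choices, and the language-specific modification for Hebrew script is not reproduced here.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch) alignment score of two character strings.

    The scoring values are illustrative assumptions, not the study's settings.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b` costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def sw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Local (Smith-Waterman) score: the best-scoring pair of substrings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Scores are floored at zero so an alignment can restart anywhere.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def similarity(a, b):
    """Global score divided by the longer length (an assumed normalization).

    With the default scoring, identical strings give 1.0.
    """
    longest = max(len(a), len(b))
    return nw_score(a, b) / longest if longest else 1.0


def similarity_matrix(texts):
    """Symmetric matrix of pairwise similarities for a list of transcriptions."""
    n = len(texts)
    sim = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = similarity(texts[i], texts[j])
    return sim


# Toy strings standing in for transcriptions (not data from the corpus).
matrix = similarity_matrix(["abc", "abd", "xyz"])
```

A matrix of this kind can be converted to distances (for example, one minus a similarity rescaled to the range 0 to 1) and passed to a t-SNE implementation that accepts precomputed distances, such as scikit-learn's `TSNE` with `metric="precomputed"` and `init="random"`. The local score from `sw_score` is better suited to a fragmentary witness, since it rewards the best-matching stretch rather than penalizing the missing text.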
Arrant et al. (Sat,) studied this question.