November 24, 2025 · Open Access

Analyzing spelling patterns in the manuscripts of the tales of Canterbury

Abstract

For decades, scholars have suggested that analysis of spellings in medieval European manuscripts might help establish who wrote the manuscripts and where and when they were written. The increased availability of full-text transcripts of manuscripts is creating larger sets of data and has opened the possibility of using quantitative methods. This article reports on an analysis of spellings in manuscripts of Geoffrey Chaucer’s Book of the Tales of Canterbury. The analysis confirmed long-held beliefs, based on traditional paleography, that multiple manuscripts can be identified as written by the same scribe: these manuscripts are closely aligned in their spelling patterns. Further, the analysis showed that manuscripts written by the same scribe may be closely aligned in spelling even when they were copied from exemplars with significantly different texts. The analysis also suggested some unexpected linkages among the manuscripts, which found support in examination of the text in those manuscripts. Three sets of spelling data were submitted to the analysis: one with all spellings assigned to regularized forms; a second with spellings sorted by headword and part of speech; and a third with completely unsorted “bags of words” for each document. While the more structured data did yield more granular results, the gains seemed relatively slight compared to the extra effort required to create the data.
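
As an illustration of the third, unsorted data set, the following minimal sketch builds a "bag of words" for each transcript and compares two bags. The abstract does not name the statistical method used, so the choice of cosine similarity here is an assumption made only for illustration; the function names and the three toy lines (`ms_a`, `ms_b`, `ms_c`) are likewise hypothetical and are not taken from any real transcript.

```python
from collections import Counter
import math


def bag_of_spellings(text):
    """Count each spelling (lowercased, whitespace-separated) in a transcript."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two spelling-count bags (assumed measure)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented toy lines standing in for full-text transcripts (hypothetical data).
transcripts = {
    "ms_a": "whan that aprille with hise shoures soote",
    "ms_b": "whan that aprille with hise shoures sote",
    "ms_c": "whanne that april wyth his schoures swote",
}

bags = {name: bag_of_spellings(text) for name, text in transcripts.items()}

sim_ab = cosine_similarity(bags["ms_a"], bags["ms_b"])
sim_ac = cosine_similarity(bags["ms_a"], bags["ms_c"])
print(f"ms_a vs ms_b: {sim_ab:.3f}")
print(f"ms_a vs ms_c: {sim_ac:.3f}")
```

In this toy case the two transcripts that share most spellings score higher than the pair that share few, which is the kind of alignment the abstract describes for manuscripts by the same scribe. The more structured data sets (regularized forms, or headword plus part of speech) would need an additional mapping from each spelling to its lemma, which this sketch does not attempt.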
