Description

This dataset provides a quantitative inventory of untranslated texts in the Pāli Canon, derived from corpus analysis of the Chaṭṭha Saṅgāyana Tipiṭaka (CST 4.1) using RWS Trados Studio 2024. It documents the scope of Pāli literature that remains inaccessible to English-reading audiences across the three textual layers of the Theravāda tradition, namely the root Tipiṭaka (Mūla), the commentaries (Aṭṭhakathā), and the sub-commentaries (Ṭīkā), together with the Añña (extra-canonical) layer.

For each text, the dataset records the title, textual layer, Trados-verified word count, current translation status, and source URL where available. The corpus was segmented into discrete CST 4.1 project files and processed through Trados Studio 2024 to produce reproducible word-count data. Translation status was determined through systematic review of published English translations (Pali Text Society, Wisdom Publications, BPS, SuttaCentral, and individual scholarly translations), cross-referenced against community translation initiatives.

The dataset underpins the quantitative claims in Xynos (2025) regarding the Pāli literature accessibility gap: while approximately 86% of the root Tipiṭaka is available in English translation, only approximately 5% of the Aṭṭhakathā layer has been rendered into English, leaving overall Pāli literature accessibility at approximately 30%. These figures support the rationale for commentary-augmented translation systems such as PaliVerse.
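As an illustration of how layer-level and overall coverage figures of this kind could be derived from per-text records, the following is a minimal sketch. It assumes that coverage is weighted by word count, which the description does not state explicitly. The field names (`layer`, `words`, `translated`) and all numbers are hypothetical placeholders and are not taken from the dataset.

```python
from collections import defaultdict

# Hypothetical records: field names and word counts are illustrative only
# and do not reproduce any values from the actual dataset.
records = [
    {"title": "Text A", "layer": "Mula", "words": 800, "translated": True},
    {"title": "Text B", "layer": "Mula", "words": 200, "translated": False},
    {"title": "Text C", "layer": "Atthakatha", "words": 100, "translated": True},
    {"title": "Text D", "layer": "Atthakatha", "words": 900, "translated": False},
]


def coverage_by_layer(rows):
    """Return the word-weighted translated share for each textual layer."""
    total = defaultdict(int)
    done = defaultdict(int)
    for row in rows:
        total[row["layer"]] += row["words"]
        if row["translated"]:
            done[row["layer"]] += row["words"]
    # Skip layers whose total word count is zero to avoid division by zero.
    return {layer: done[layer] / total[layer] for layer in total if total[layer]}


def overall_coverage(rows):
    """Return the word-weighted translated share across all layers."""
    total = sum(row["words"] for row in rows)
    done = sum(row["words"] for row in rows if row["translated"])
    return done / total if total else 0.0


if __name__ == "__main__":
    for layer, share in coverage_by_layer(records).items():
        print(f"{layer}: {share:.0%}")
    print(f"Overall: {overall_coverage(records):.0%}")
```

Because the overall figure is weighted by word count, a large and mostly untranslated layer pulls it well below the coverage of a smaller, well-translated layer. That is the pattern the description reports for the root texts and the commentaries.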
Intended uses

- Citation in academic work on Pāli translation, Buddhist Studies, and digital humanities
- Reference inventory for translators, scholars, and publishers planning translation projects
- Baseline data for measuring future progress in Pāli translation coverage
- Input for computational and AI-assisted translation research targeting the Pāli Canon

Methodology

Word counts were generated using RWS Trados Studio 2024's native analysis function applied to CST 4.1 source files. Translation status reflects the state of published English translations as of 2025.

Citation

Xynos, M. (2025). Untranslated Pāli Texts: A Corpus Analysis of Translation Coverage in CST 4.1 Based on RWS Trados Studio 2024 Analysis Dataset. Zenodo. https://doi.org/DOI assigned on publication
Michail Xynos studied this question.