Revised Description for Zenodo Publication

This dataset presents a comprehensive quantitative analysis of the complete Pali canonical literature, produced with computer-assisted translation (CAT) software. All word counts and statistics were generated with RWS Trados Studio 2024 applied to the Chattha Sangayana Tipitaka version 4.1 (CST 4.1), the most authoritative digital edition of the Pali Canon currently available.

Scope and Methodology: The analysis encompasses the entire corpus of Pali Buddhist literature preserved in the Theravāda tradition, totaling 9,724,081 words across four major collections: the Tipiṭaka (root canonical texts), Aṭṭhakathā (ancient commentaries), Ṭīkā (sub-commentaries), and Añña (supplementary texts). Each text was processed individually through the word count function of Trados Studio 2024, providing precise statistical data unavailable in previous scholarship.

Contents:
- Complete word counts for all 149+ individual Pali texts
- Statistical breakdown by collection and sub-collection
- Page estimates (calculated at 250 words per page)
- Hierarchical organization reflecting the traditional canonical structure

Significance: This dataset addresses a critical gap in Buddhist Studies and digital humanities research by providing empirically grounded statistics for corpus-scale analysis of Pali literature. The 2.02:1 ratio of commentary to root text challenges assumptions about the relative scale of these textual layers. These statistics are essential for computational approaches to Buddhist textual studies, translation coverage assessment, and digital preservation planning.
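The page estimates are described as calculated at 250 words per page. A minimal sketch of that arithmetic follows, applied to the corpus total of 9,724,081 words. The function name `estimate_pages` is hypothetical, and rounding up to whole pages is an assumption; the description does not say how fractional pages are handled in the spreadsheet.

```python
WORDS_PER_PAGE = 250  # figure stated in the dataset description


def estimate_pages(word_count: int, words_per_page: int = WORDS_PER_PAGE) -> int:
    """Estimate the page count for a text of `word_count` words.

    Assumption: a partially filled page counts as a full page (ceiling).
    The dataset itself may round differently.
    """
    if word_count < 0 or words_per_page <= 0:
        raise ValueError("word_count must be >= 0 and words_per_page must be > 0")
    # Integer ceiling division, avoiding floating-point arithmetic.
    return -(-word_count // words_per_page)


corpus_total = 9_724_081  # total word count given in the description
print(estimate_pages(corpus_total))  # → 38897
```

Without rounding, the quotient is 38,896.324, so the choice between rounding up and rounding to the nearest page changes the corpus-level estimate by a single page.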
Technical Specifications:
- Source: Chattha Sangayana Tipitaka 4.1 (Vipassana Research Institute)
- Analysis Tool: RWS Trados Studio 2024
- Language: Pali (original texts only)
- Data Format: Excel spreadsheet with hierarchical text classification
- Coverage: Complete Pali canonical literature (Theravāda tradition)

Use Cases: This dataset supports research in Buddhist Studies, digital humanities, corpus linguistics, and AI-assisted canonical analysis. It provides the foundational statistics necessary for evidence-based claims about the scope and scale of Pali Buddhist literature.
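For corpus-level work, the hierarchical classification can be rolled up into per-collection totals and layer-to-root ratios. The sketch below shows one way to do this once the spreadsheet rows have been exported. The row layout, the function names, and every word count in it are illustrative placeholders, not values or column names taken from the dataset. Which layers enter the reported 2.02:1 commentary-to-root ratio is not specified here, so the ratio function takes the layer as a parameter.

```python
from collections import defaultdict

# Hypothetical rows: (collection, sub_collection, text_title, word_count).
# All counts are made-up placeholders for illustration only.
rows = [
    ("Tipiṭaka", "sub-collection 1", "text A", 1000),
    ("Tipiṭaka", "sub-collection 2", "text B", 2000),
    ("Aṭṭhakathā", "sub-collection 1", "text C", 4000),
    ("Ṭīkā", "sub-collection 1", "text D", 1500),
]


def totals_by_collection(rows):
    """Sum word counts per top-level collection."""
    totals = defaultdict(int)
    for collection, _sub_collection, _title, words in rows:
        totals[collection] += words
    return dict(totals)


def ratio_to_root(totals, layer, root="Tipiṭaka"):
    """Return the size of `layer` relative to the root collection."""
    return totals[layer] / totals[root]


totals = totals_by_collection(rows)
print(totals)
print(round(ratio_to_root(totals, "Aṭṭhakathā"), 2))
```

Keeping the sub-collection in each row means the same loop can be keyed on `(collection, sub_collection)` to reproduce the finer breakdown.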
Michail Xynos (Fri,) studied this question.