What question did this study set out to answer?

The study aims to provide a decipherment of the Voynich Manuscript as a Sri Lankan pharmaceutical text.

May 17, 2026, Open Access

A Candidate Decipherment of the Voynich Manuscript: Evidence for a Spoken Elu-Sinhala Pharmaceutical Register (V7)

Key Points

Proposes a candidate decipherment of the Voynich Manuscript as a 15th-century Sri Lankan Elu-Sinhala pharmaceutical text
Consolidates all findings through versions V6 and V7 into a single reference
Validates the decipherment statistically on 36,633 tokens
Compares the result against 27 corpora in a rival-language tournament
Assigns approximately 90% confidence to the identification of the manuscript's sections as pharmaceutical components
Reports a 95× gap between Sri Lankan pharmaceutical controls and the rival traditions tested
Reports a complete database: 0 blank decoded forms and 0 blank meaning assignments, with ~202 soft-uncertain strings tracked separately

Abstract

This is Version 7 (2026-05-15) of a candidate decipherment of the Voynich Manuscript, which proposes that the manuscript is a 15th-century Sri Lankan Elu-Sinhala pharmaceutical text. It is the consolidated single-reference version incorporating all findings through V6 and V7.

Candidate hypothesis. The manuscript is not encrypted; its script is a bespoke phonetic abugida mapping to Sinhala/Elu phonemes via 39 active rules. Under this model the section-level function is accounted for: seven manuscript sections correspond to functional components of an Ayurvedic pharmacopoeia, namely a plant index (HERBAL), a production calendar (ASTRO), a nakshatra timing index (COSMO), an oleation procedure manual (BALNEO), a preparation interface (Rosette), a drug catalog (PHARMA), and a disease formulary (RECIPE). The identification carries approximately 90% confidence; the remaining uncertainty is dominated by sister-language indistinguishability and the need for specialist Sinhala/Elu philological review.

Statistical validation (36,633 tokens; DB commit d32bc5e; DB SHA256: 9de4c7032311ea627e0d89f5c04f7b4ced83c2369f4c0e630580e536081522a3). 27-corpus rival-language tournament: no tested tradition (Arabic, Tibetan, Tamil Siddha, European) scores above 0.5%, against Sri Lankan pharmaceutical controls at 66.67% on the repeated locked-anchor metric, a 95× gap. All five Wickremasinghe phonological laws of Old Sinhala are independently required by the decoder (convergence confirmed nine days after decoder freeze). 45/50 top decoded words cluster by section at p<0.001 under a proportional null. BM unordered concept-overlap screen ≥4: 137 matches (p=0.018). 24/24 Team B validation gates pass.

New in V7. VPNS two-tier encoding confirmed: high-frequency preparation-state markers coexist with low-frequency named ingredient tokens in the same formula lines. Five confirmed nakshatra identifications (Aśvinī, Anurādhā, Māghā, Kṛttikā, Puṣya). Complete visual cross-check of all 225 Beinecke facsimile folios. HERBAL opener grammar discovery: position-0 tokens encode preparation format, not species names, which resolves the headline-label zero-hit result. 14 plant identifications meet strict visual + phonological + cross-section criteria. Database complete: 0 blank decoded forms, 0 blank meaning assignments, ~202 soft-uncertain strings tracked separately. Rhetorical register recalibrated to match the ~90% confidence number throughout.

What remains uncertain. Word-level accuracy is tiered: 11% dictionary-attested, 14% phonologically grounded, 36% rule-generated, 38% context-inferred. Botanical identifications (98/110 proposed) and nakshatra identifications (8 of 13 remaining candidates) require specialist blind review. The initial-sound gap (/b/, /v/ near-absent) is a noted decoder risk. No specialist Sinhala/Elu philological review of decoded prose has been conducted.

AI-assisted methodology note. The computational pipeline, vocabulary analysis, and statistical validation were developed with AI coding assistance (Anthropic Claude). All statistical results are independently reproducible; the canonical validation runner and checksums are in the GitHub repository.

Reproducibility. Canonical runner: teamb_rerund32bc5e_20260515/runcurrentdb_suite.sh. Checksums: results/CHECKSUMS.sha256 (36 files, repo root). GitHub: https://github.com/kamb-code/Voynich

Original content CC-BY-4.0. Bundled third-party corpora retain their own licenses.
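
The reproducibility note above refers to a checksum manifest covering 36 files. As a minimal sketch of how a reader might check downloaded files against such a manifest, the Python below assumes the manifest uses the common `sha256sum` layout (`<hex digest>  <relative path>` per line); that layout, and the helper names `sha256_of`, `parse_manifest`, and `verify`, are illustrative assumptions and are not taken from the repository.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(text: str) -> dict[str, str]:
    """Parse lines of the ASSUMED form '<hex digest>  <relative path>'."""
    entries: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        digest, _, name = line.partition(" ")
        # sha256sum marks binary-mode entries with a leading '*'
        entries[name.strip().lstrip("*")] = digest.lower()
    return entries


def verify(manifest: Path, root: Path) -> dict[str, bool]:
    """Map each listed path to True when the file exists and its digest matches."""
    expected = parse_manifest(manifest.read_text(encoding="utf-8"))
    return {
        name: (root / name).is_file() and sha256_of(root / name) == digest
        for name, digest in expected.items()
    }
```

Under these assumptions, calling `verify(Path("results/CHECKSUMS.sha256"), Path("."))` from a checkout of the repository would return one boolean per listed file; any `False` entry marks a file that is missing or differs from the recorded digest. The same `sha256_of` helper could be compared against the DB SHA256 quoted in the abstract.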
