Preprint presenting a statistically validated decipherment of Beinecke MS 408 (the Voynich Manuscript). The manuscript uses a three-layer nomenclator cipher encoding medieval Italian medical text in the Regimen Sanitatis tradition. A decoded vocabulary of 1,106 unique lemmas assigns a reading to all 33,710 corpus tokens: 99.617% by frozen dictionary rules with no late-stage inference, 0.282% by validated or probable late-stage methods, and 34 tokens (0.101%) uncertain. A validation audit addresses three concerns: circularity (folio-holdout rescoring: 25/33 pass, 6 marginal, 2 fail; all unstable tokens confined to the lowest confidence tiers), overfitting (ablation: the base dictionary alone achieves 99.617%), and generalization (1,000-iteration bootstrap: seen-code accuracy 100.00%, unseen-code accuracy 96.59%). The supplementary data package includes the decode dictionary (5,036 entries), a 1,106-word glossary with POS tags and English glosses, the full EVA corpus, a deterministic decoder (Python, no dependencies), and a machine-readable validation summary. Code and data are available at https://github.com/d-w-h/voynich-nomenclator; an interactive folio browser is at https://d-w-h.github.io/voynich-nomenclator/.
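The coverage figures above can be cross-checked with simple arithmetic. The sketch below is not part of the preprint's data package: it takes only the stated total (33,710 tokens), the stated uncertain count (34), and the two stated percentages, then derives the token counts those percentages imply. The derived counts are rounded inferences, not numbers reported in the abstract, and the function and variable names are hypothetical.

```python
# Consistency check on the abstract's coverage figures.
# Stated in the abstract: total tokens, uncertain tokens, and the percentages.
# Assumption: per-tier token counts are NOT stated; they are inferred here by rounding.

TOTAL_TOKENS = 33_710
UNCERTAIN_TOKENS = 34
DICTIONARY_PCT = 99.617   # frozen dictionary rules
LATE_STAGE_PCT = 0.282    # validated or probable late-stage methods


def implied_count(percent: float, total: int = TOTAL_TOKENS) -> int:
    """Token count implied by a percentage of the corpus, rounded to the nearest integer."""
    return round(percent / 100 * total)


def as_percent(count: int, total: int = TOTAL_TOKENS) -> float:
    """Share of the corpus as a percentage, rounded to three decimals like the abstract."""
    return round(100 * count / total, 3)


dictionary_tokens = implied_count(DICTIONARY_PCT)
late_stage_tokens = implied_count(LATE_STAGE_PCT)
tier_sum = dictionary_tokens + late_stage_tokens + UNCERTAIN_TOKENS

print("implied dictionary tokens:", dictionary_tokens)
print("implied late-stage tokens:", late_stage_tokens)
print("uncertain share (%):", as_percent(UNCERTAIN_TOKENS))
print("tiers sum to corpus size:", tier_sum == TOTAL_TOKENS)
```

If the three tiers are exhaustive and non-overlapping, the implied counts should sum to the corpus size, and the uncertain share should round to the 0.101% quoted in the abstract.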
Darren Helton
www.synapsesocial.com/papers/69e1cf985cdc762e9d8588b3 — DOI: https://doi.org/10.5281/zenodo.19597837