The Voynich Manuscript (Beinecke MS 408) is a 240-page illustrated codex carbon-dated to 1404-1438 whose script has resisted decipherment for more than six centuries. Every quantitative decipherment attempt on record has treated the manuscript as a text-first object. We present the first systematic computational visual semantic analysis of the manuscript, treating its 206 page images as the primary signal. Using a zero-shot visual semantic profiling platform in which a frozen vision-language foundation model is scored against sixteen natural-language archetype descriptors, we profile every analysable page of the Beinecke digital facsimile (197 pages after excluding nine covers and binding flyleaves) and measure whether the resulting sixteen-dimensional profiles discriminate among the sections of the scholarly taxonomy. All sixteen dimensions discriminate between sections at p < 10^-15 under one-way ANOVA, Welch's robust ANOVA, and Kruskal-Wallis tests, with effect sizes (eta²) between 0.30 and 0.83. A multinomial logistic regression, fit inside a Pipeline that eliminates cross-validation scaler leakage, recovers scholarly section labels with 90.4% leave-one-out accuracy (Wilson 95% CI 85.4% to 93.7%; permutation-test p < 10^-3), against a 20% chance baseline and a 59.9% majority-class baseline. As ablations we report 92.4% accuracy on the raw 768-d foundation-model embeddings (the archetype projection trades a small amount of accuracy for full interpretability) and 72.1% on six handcrafted layout features. A lens-specificity control shows that two alternative 16-d archetype lenses also recover section structure well above chance, confirming that the signal is a property of the manuscript rather than of the target-appropriate lens. A head-to-head comparison against a character-n-gram classifier on Takeshi Takahashi's complete Voynichese transcription shows that the text channel recovers sections at 92.3%, statistically indistinguishable from the 16-d visual channel on the same 182-page intersection. An out-of-distribution sanity check applies the Voynich lens via a local off-the-shelf CLIP pipeline to the Tacuinum Sanitatis, the Codex Seraphinianus, and the Rohonc Codex: the lens fires on herbal/pharmaceutical/fertility dimensions for the Tacuinum and on encoded/hidden for the two undeciphered-script codices, confirming that the lens is a legitimate medieval-codex content detector rather than a Voynich-specific artefact. We conclude that the illustrations of the Voynich Manuscript encode structured thematic content accessible to computational analysis even though the underlying text is not, that the scholarly section structure is independently recoverable by a non-textual method, and that the broader visual channel of the manuscript is considerably richer than the text-first literature has assumed. We do not claim to have read the manuscript. We claim only that one more channel of the manuscript, its visual channel, is not empty.
Companion repository: https://github.com/xenoglyph-ai/voynich-public
Companion dataset: https://doi.org/10.5281/zenodo.19560769
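The Wilson interval quoted for the leave-one-out accuracy can be checked by hand. The sketch below assumes that 90.4% of 197 pages corresponds to 178 correct predictions; that count is inferred from the rounded figures and is not stated in the text. The helper name `wilson_interval` is illustrative and is not taken from the companion repository.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95):
    """Wilson score interval for a binomial proportion.

    Hypothetical helper for illustration only.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2.0 * n)) / denom
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Assumed count: 178 correct out of 197 analysable pages (178 / 197 is about 0.904).
low, high = wilson_interval(178, 197)
print(f"accuracy = {178 / 197:.3f}, Wilson 95% CI = ({low:.3f}, {high:.3f})")
```

Under that assumed count the interval comes out at roughly 0.854 to 0.937, which agrees with the reported 85.4% and 93.7%. The Wilson form is a reasonable choice here because accuracy is close to 1, where the simpler normal-approximation interval tends to be too narrow and can exceed 100%.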
Jacob Lyons
Moog (United States)
synapsesocial.com/papers/69df2c2fe4eeef8a2a6b1288 — DOI: https://doi.org/10.5281/zenodo.19560957