This study maps 27 large language models, using public benchmark scores (TruthfulQA, MMLU, HellaSwag, ARC-Challenge, Arena Elo, MT-Bench, and sycophancy rates), onto the Void Framework's three-dimensional behavioral space and composite Peclet number (Pe). It tests whether Pe predicts cognitive performance beyond what any single benchmark captures. Partial correlations controlling for TruthfulQA show that Pe significantly predicts MMLU, HellaSwag, and ARC-Challenge (all p < 0.02). A paired analysis of 9 base-aligned model pairs shows that alignment systematically increases Pe, with perfect sign consistency (p = 0.0002). The analysis addresses the framework's circularity gap using only independently measured data.
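As a rough illustration of the two statistical ideas named in the abstract, the sketch below computes a first-order partial correlation (two variables with a third controlled for) and an exact two-sided sign test on paired differences. It is a minimal sketch under stated assumptions, not the author's analysis code: the function names are hypothetical, no data from the paper is used, and the abstract does not say which paired test produced its reported p-value, so the sign test here is only one possible choice.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z.

    In the abstract's setting (an assumption about how the variables line up),
    x would be Pe, y a benchmark such as MMLU, and z the TruthfulQA score.
    """
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def sign_test_p(diffs):
    """Exact two-sided sign test p-value for paired differences (zeros dropped)."""
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    k = sum(d > 0 for d in nonzero)  # number of positive differences
    # Probability of a split at least as lopsided as the observed one, one tail.
    tail = sum(math.comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Calling `partial_corr(pe, mmlu, truthfulqa)` on three equal-length lists of per-model values (hypothetical variable names) returns the correlation between the first two after the linear effect of the third is removed. `sign_test_p` takes the per-pair change in Pe (aligned minus base) and depends only on the signs of those changes, not their magnitudes.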
Anthony W. Eckert
Anthony W. Eckert (Mon,) studied this question.
www.synapsesocial.com/papers/69ccb66716edfba7beb88038 — DOI: https://doi.org/10.5281/zenodo.19340892