What question did this study set out to answer?

This study asks whether large language models with different architectures encode semantic structure in geometrically equivalent ways in their residual streams.

March 28, 2026 | Open Access

Cross-Architecture Semantic Geometry in LLM Residual Streams: Alignment, Null Controls, and Overfitting Bounds

Key Points

The study asks whether large language models with different architectures encode semantic structure in geometrically similar ways.
It uses raw residual-stream activations (no sparse autoencoders) from five models across four families (Gemma, Llama, Qwen, Mistral).
Alignment is tested on a 64-prompt probe set against 50 Wikipedia control prompts.
The analysis applies Procrustes alignment at several PCA dimensionalities.
All tested models separate the structured prompts from the controls perfectly (ARI = 1.0 at all tested layers).
Semantic domain labels are linearly decodable within each model at 60–80% accuracy, against a chance level of 25%.
Representational similarity between independently trained models is high (CKA ≥ 0.97 between Llama 3.3 70B and Qwen 2.5 72B).
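For readers unfamiliar with CKA, the following is a minimal sketch of the widely used linear variant, computed on two activation matrices that share the same prompts as rows. The page does not say which CKA variant or preprocessing the study used, so the function name `linear_cka` and the choice of the linear kernel are assumptions for illustration only.

```python
import numpy as np


def linear_cka(X, Y):
    """Linear CKA between two activation matrices.

    X has shape (n_prompts, d_x) and Y has shape (n_prompts, d_y); the rows
    must correspond to the same prompts. The two widths may differ.
    NOTE: illustrative sketch, not the study's own implementation.
    """
    # Center each feature so the score ignores constant offsets.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)

    # Squared Frobenius norm of the cross-covariance, normalized by the
    # Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return float(cross / (norm_x * norm_y))
```

Linear CKA is invariant to orthogonal rotations and isotropic rescaling of either representation, which is why it can compare models whose residual streams have different bases and widths.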

Abstract

We investigate whether large language models of different architectures encode semantic structure in geometrically equivalent ways in their residual streams. Using raw residual activations (no sparse autoencoders), we test cross-architecture alignment across five models spanning four families (Gemma, Llama, Qwen, Mistral; 8B–123B parameters). Three findings are robust: (1) all tested models perfectly separate a structured 64-prompt probe set from 50 Wikipedia controls (ARI = 1.0, at all tested layers); (2) semantic domain labels are linearly decodable within each model at 60–80% accuracy (chance = 25%); and (3) representational similarity between independently trained models is high (CKA ≥ 0.97 between Llama 3.3 70B and Qwen 2.5 72B).

We also identify a methodological problem in a commonly used alignment measure: the standard pipeline of PCA(50) + Procrustes on 64 points produces near-perfect cross-model transfer (95–100%) even for random-label controls, making it uninformative as evidence for shared geometry. A constrained analysis with PCA(5) reveals honest transfer rates: 66% for the original probe set, 94% for recipe cuisine types, 52% for animals, and 52% for random-label controls.

We conclude that cross-model semantic geometry is partially shared (coherent taxonomies produce above-chance transfer while random controls do not) but that Procrustes alignment at standard PCA dimensionalities is unreliable with small sample sizes. We propose a constrained protocol (multiple PCA dimensionalities with random-label controls) as a necessary methodological check for future cross-model alignment studies.
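The constrained protocol proposed in the abstract (several PCA dimensionalities, each paired with random-label controls) could look roughly like the sketch below. The abstract does not specify how transfer is scored, so the nearest-centroid classifier, the function names, the default dimensionalities, and the number of label shuffles are all assumptions made for illustration, not the authors' implementation. As in the pipeline described above, the rotation is fitted on all paired points with no held-out split.

```python
import numpy as np


def pca_project(X, k):
    """Project the rows of X onto their top-k principal components."""
    Xc = X - X.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


def procrustes_rotation(A, B):
    """Orthogonal matrix R that minimizes ||A @ R - B||_F."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt


def nearest_centroid_predict(train_X, train_y, test_X):
    """Assign each test row to the class with the closest training centroid."""
    classes = np.unique(train_y)
    centroids = np.stack([train_X[train_y == c].mean(axis=0) for c in classes])
    dists = ((test_X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]


def transfer_accuracy(acts_a, acts_b, labels, k):
    """Fit centroids in model B's PCA(k) space, then score model A's points
    after rotating them into that space (hypothetical scoring rule)."""
    Za = pca_project(acts_a, k)
    Zb = pca_project(acts_b, k)
    R = procrustes_rotation(Za, Zb)
    pred = nearest_centroid_predict(Zb, labels, Za @ R)
    return float((pred == labels).mean())


def constrained_protocol(acts_a, acts_b, labels, dims=(5, 10, 50),
                         n_shuffles=20, seed=0):
    """For each PCA dimensionality, report (real-label accuracy,
    mean accuracy over random-label controls)."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    report = {}
    for k in dims:
        real = transfer_accuracy(acts_a, acts_b, labels, k)
        null = np.mean([
            transfer_accuracy(acts_a, acts_b, rng.permutation(labels), k)
            for _ in range(n_shuffles)
        ])
        report[k] = (real, float(null))
    return report
```

Under this reading, a dimensionality is informative only when the real-label accuracy clearly exceeds the random-label control at the same `k`; if the two are close, the apparent transfer says little about shared geometry.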
