What question did this study set out to answer?

This research aims to create a model that keeps physical and linguistic representations separate while allowing for reversible combinations.

March 27, 2026 | Open Access

Dual-Plane World Model: Reversible Composition of Physical and Linguistic Embeddings

Key Points

Proposed a dual-plane model that separates physical and linguistic embeddings.
Used a reversible addition operation to combine these embeddings.
Conducted experiments using synthetic data to test the model's effectiveness.
Achieved a physics recovery error of 0.0109.
Demonstrated the ability to recover physical representations without linguistic context.
Showed successful zero-shot generalization through meaning rather than language tokens.

Abstract

Existing multimodal models (CLIP, ImageBind) align physical and linguistic representations in a single shared space, losing the structure of each modality in the process. We propose an alternative: keep the physical and linguistic planes separate, combining them through a reversible addition operation. The central claim is that if physics + language = combined, then the physical plane can be recovered from the combined embedding without any linguistic context, purely through subtraction. Experiments on synthetic data confirm the viability of this architecture: the physics recovery error was 0.0109, demonstrating zero-shot generalization through meaning rather than through language tokens.
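The additive composition described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the function names (`compose`, `recover_physics`, `recovery_error`), the embedding dimension of 8, and the random vectors standing in for embeddings are all assumptions made for illustration. The sketch also assumes the linguistic embedding vector is available at subtraction time and uses mean absolute error as the metric; the abstract specifies neither how the subtrahend is obtained nor how the error is computed, so the sketch does not reproduce the reported value of 0.0109.

```python
import random


def compose(physics, language):
    """Combine two equal-length embeddings by element-wise addition."""
    if len(physics) != len(language):
        raise ValueError("embeddings must have the same dimension")
    return [p + l for p, l in zip(physics, language)]


def recover_physics(combined, language):
    """Invert compose() by subtracting the linguistic embedding."""
    if len(combined) != len(language):
        raise ValueError("embeddings must have the same dimension")
    return [c - l for c, l in zip(combined, language)]


def recovery_error(original, recovered):
    """Mean absolute error between two vectors (metric chosen for illustration)."""
    return sum(abs(a - b) for a, b in zip(original, recovered)) / len(original)


# Hypothetical stand-ins for learned embeddings: random Gaussian vectors.
rng = random.Random(0)
dim = 8
physics = [rng.gauss(0.0, 1.0) for _ in range(dim)]
language = [rng.gauss(0.0, 1.0) for _ in range(dim)]

combined = compose(physics, language)            # physics + language = combined
recovered = recover_physics(combined, language)  # combined - language
error = recovery_error(physics, recovered)

print(f"recovery error: {error:.2e}")
```

With exact vectors, subtraction undoes addition up to floating-point rounding, so the error in this sketch is negligible. A nonzero error such as the one reported would have to come from parts of the pipeline that this sketch does not model.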

Cite This Study

Artem Gorbunov studied this question.

synapsesocial.com/papers/69c620d515a0a509bde19829 https://doi.org/10.5281/zenodo.19218011
