May 12, 2026 · Open Access

The Source Integrity Layer: Presence × Integrity and the Governance of AI-Native Information Distribution

Key Points

Aims to address structural challenges of economic sustainability and epistemic integrity in AI-native information systems.
Proposes a Source Integrity Layer as a governance architecture for AI information systems.
Analyzes the historical context of web information distribution and its transition to AI-mediated outputs.
Defines key components for sustainable AI governance, including provenance metadata and manipulation detection.
Identifies the collapse of the crawl-and-return model as a risk to source integrity and AI quality.
Demonstrates how current AI systems are vulnerable to source attacks due to inadequate governance.
Proposes a framework that balances open access with accountability to promote trustworthy AI outputs.

Abstract

Civilization Physics: AI Information Systems

[…] they are dynamic information systems whose future outputs depend on the quality of present inputs. Once AI systems mediate what gets seen, trusted, and recopied, the integrity of their source environment becomes part of the systems’ own cognitive stability.

To formalize this, the paper extends a three-layer model of data:

Surface-linguistic data: stylistic and token-level patterns.
World-model data: causal and factual structure about reality.
Judgment data: structured human evaluation regarding correctness, trustworthiness, and value.

The paper argues that the third layer, judgment data, is the decisive bottleneck for durable AI quality. Large quantities of social-media exhaust or interaction data may provide scale and immediacy, but they do not automatically provide high-integrity judgment signals. The xAI case is used as a motivating example: despite massive access to real-time conversational data and compute resources, xAI did not secure frontier AI leadership because access to noisy interaction streams did not solve the harder problem of structured judgment and epistemic governance.

A central claim of the paper is that current AI systems already face an expanding source attack surface. Optimization practices such as Generative Engine Optimization (GEO), prompt injection, retrieval poisoning, and judgment capture demonstrate that AI answer systems are becoming direct targets for manipulation. Once recommendation and answer layers mediate visibility and trust, actors optimize not only for discovery but for influence over synthesized outputs themselves.

This leads to a critique of two inadequate governance extremes:

Naive open crawling: treats all publicly accessible content as equivalent, exposing systems to poisoning, manipulation, and synthetic amplification.
Closed whitelisting: over-constrains source diversity, producing epistemic starvation and reducing contact with heterogeneous reality.

The paper argues that strong AI systems require both broad contact with reality and mechanisms for weighting, verification, and audit. This principle is formalized through Presence × Integrity:

Presence: continuous exposure to diverse, living, real-world inputs, including minority perspectives, edge cases, and contested domains.
Integrity: provenance, accountability, editorial transparency, manipulation resistance, and auditable process.

Presence without integrity produces informational chaos. Integrity without presence produces brittle epistemic closure. Sustainable AI-native knowledge systems require both simultaneously.

To operationalize this principle, the paper proposes a Source Integrity Layer, defined as a machine-readable governance architecture for AI information systems. The proposed architecture contains seven core components:

Open source registries with structured identity metadata.
Provenance metadata attached to retrievable documents.
Trust-weighted retrieval systems incorporating editorial and contextual signals.
Manipulation-detection infrastructure for poisoning and injection attacks.
Distributed human audit nodes for high-risk judgment domains.
Source-return mechanisms reconnecting AI use with economic feedback to sources.
Appeals systems preserving procedural legitimacy and correcting governance errors.

A major contribution of the paper is the proposal for stratified source governance rather than binary trust systems. Sources are organized into open-web, verified, contested, and judgment layers, each with different weighting, oversight, and review mechanisms. Verification is defined not as ideological certification, but as validation of identity, provenance, and process.
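
As a rough illustration of how stratified source layers could feed trust-weighted retrieval, the following is a minimal Python sketch and not the paper's implementation. The abstract names the four layers but gives no weights or scoring rule, so the tier weights, the provenance discount, the multiplicative score, and every identifier (`Tier`, `Document`, `TIER_WEIGHT`, `trust_weighted_score`, `rank`) are assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    """The four source layers named in the abstract."""
    OPEN_WEB = "open-web"
    VERIFIED = "verified"
    CONTESTED = "contested"
    JUDGMENT = "judgment"


# Hypothetical weights: the abstract does not specify any numbers.
TIER_WEIGHT = {
    Tier.OPEN_WEB: 0.5,
    Tier.VERIFIED: 1.0,
    Tier.CONTESTED: 0.7,
    Tier.JUDGMENT: 1.0,
}

# Hypothetical discount for documents lacking provenance metadata.
NO_PROVENANCE_FACTOR = 0.6


@dataclass(frozen=True)
class Document:
    doc_id: str
    tier: Tier
    relevance: float        # query relevance in [0, 1], from any retriever
    has_provenance: bool    # provenance metadata is attached
    flagged: bool = False   # set by a manipulation detector (assumed)


def trust_weighted_score(doc: Document) -> float:
    """Relevance scaled by tier weight and provenance (assumed rule)."""
    if doc.flagged:
        return 0.0
    factor = 1.0 if doc.has_provenance else NO_PROVENANCE_FACTOR
    return doc.relevance * TIER_WEIGHT[doc.tier] * factor


def rank(docs: list[Document]) -> list[Document]:
    """Order documents by trust-weighted score, highest first."""
    return sorted(docs, key=trust_weighted_score, reverse=True)


if __name__ == "__main__":
    docs = [
        Document("blog", Tier.OPEN_WEB, 0.9, has_provenance=False),
        Document("journal", Tier.VERIFIED, 0.6, has_provenance=True),
        Document("debate", Tier.CONTESTED, 0.8, has_provenance=True),
        Document("spam", Tier.VERIFIED, 0.95, has_provenance=True, flagged=True),
    ]
    print([d.doc_id for d in rank(docs)])
    # prints ['journal', 'debate', 'blog', 'spam']
```

In this sketch, open-web and contested documents are down-weighted but never excluded, which is one way to keep Presence while applying Integrity. Flagged documents score zero yet stay in the ranked list, so an audit or appeals step could still inspect them.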

The paper concludes that AI-native information distribution cannot remain stable under either unrestricted extraction or closed epistemic control. Sustainable AI systems require an open-but-integrity-weighted architecture capable of preserving both diversity and trust. Within the Civilization Physics framework, this work establishes a broader principle: AI systems remain cognitively stable only when their information environments continuously inject structured, reality-bearing negative entropy. The Source Integrity Layer is proposed as the institutional mechanism for maintaining that condition at scale.

Keywords: Source Integrity · AI Governance · Information Systems · Provenance · Negative Entropy · Judgment Data · Retrieval-Augmented Generation · AI-Native Distribution · Presence × Integrity · Civilization Physics

Authors

Xiangyu Guo
