This concept paper proposes that a major untapped training signal for advanced AI lies in the systematic divergence between human predictions and actual outcomes. Modern AI systems are trained primarily on the accumulated symbolic output of human civilization: books, papers, code, documentation, public discourse, and institutional knowledge. Yet this corpus does not contain reality in the same way that lived engagement with reality does. It contains human descriptions, interpretations, narratives, and predictions about reality. The paper argues that public discourse continuously generates forecasts about politics, economics, conflict, technology, institutions, and social change, but these forecasts are rarely extracted, formalized, scored, and analyzed as a learning resource. Treating this planetary archive of prediction-reality gaps as a developmental environment may provide a distinct source of pressure on model capability: one that exposes systems to non-stationarity, adversarial adaptation, delayed feedback, reflexivity, and other structural features of real complexity. The proposal sketches an agentic system that reads public discourse, extracts explicit and implicit forecasts, formalizes them into structured claims, tracks outcomes, compares forecasted trajectories with actual events, and analyzes recurring structures of error. The central hypothesis is that advanced AI may develop more powerful models of complex reality not only by scaling imitation of human outputs, but by learning from the conditions under which human models fail. This is a concept paper and proposed research direction. It is posted to establish a dated record of the framing and to invite discussion; a more developed technical treatment with fuller literature engagement is planned as a follow-up.
Mikhail Gorelkin (Sun,) studied this question.
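The step from extracted forecast to scored claim described in the abstract can be made concrete with a minimal sketch. It is illustrative only and is not taken from the paper: the `ForecastClaim` record, its field names, and the helper functions are hypothetical, the example claims are invented placeholders, and the Brier score (squared difference between the stated probability and the 0/1 outcome) is assumed here as one common scoring rule, since the abstract does not name one.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class ForecastClaim:
    """Hypothetical structured form of one forecast extracted from discourse."""
    source: str                     # where the forecast was found (placeholder label)
    statement: str                  # the formalized, checkable claim
    probability: float              # stated or inferred probability that it comes true
    resolve_by: date                # date by which the outcome should be known
    domain: str                     # e.g. "economics", "technology"
    outcome: Optional[bool] = None  # None until the claim has been resolved


def brier_score(claim: ForecastClaim) -> float:
    """Squared error between forecast probability and the realized 0/1 outcome."""
    if claim.outcome is None:
        raise ValueError("claim is unresolved and cannot be scored yet")
    return (claim.probability - float(claim.outcome)) ** 2


def mean_error_by_domain(claims: List[ForecastClaim]) -> Dict[str, float]:
    """Average Brier score per domain, skipping claims that are still unresolved."""
    sums: Dict[str, float] = {}
    counts: Dict[str, int] = {}
    for claim in claims:
        if claim.outcome is None:
            continue
        sums[claim.domain] = sums.get(claim.domain, 0.0) + brier_score(claim)
        counts[claim.domain] = counts.get(claim.domain, 0) + 1
    return {domain: sums[domain] / counts[domain] for domain in sums}


# Invented placeholder claims, used only to exercise the functions.
example_claims = [
    ForecastClaim("op-ed A", "Event X occurs by the deadline", 0.5,
                  date(2020, 1, 1), "economics", outcome=True),
    ForecastClaim("panel B", "Event Y occurs by the deadline", 1.0,
                  date(2020, 1, 1), "economics", outcome=False),
    ForecastClaim("blog C", "Event Z occurs by the deadline", 0.0,
                  date(2020, 1, 1), "technology", outcome=False),
    ForecastClaim("report D", "Event W occurs by the deadline", 0.75,
                  date(2030, 1, 1), "technology"),  # still unresolved
]

domain_errors = mean_error_by_domain(example_claims)
```

Grouping errors by domain is only the simplest form of the "recurring structures of error" analysis the abstract mentions. A fuller system would presumably also condition on time horizon, source type, and whether the forecast itself influenced the outcome.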