February 12, 2026 | Open Access

The "Honest Ignorance" Principle: Design Foundations for Child-Oriented AI

Key Points

Aims to highlight the mismatch between AI outputs and children's verification abilities, and proposes 'Honest Ignorance' as a solution
Analyzed current child safety measures in AI
Discussed developmental psychology principles affecting trust and verification
Outlined structural biases introduced by training with Reinforcement Learning from Human Feedback (RLHF)
Identified that the uniform fluency of LLM outputs is mismatched with the confidence cues children use to calibrate trust
Proposed the Honest Ignorance design principle to reduce children's overreliance on AI
Found that content filtering and AI literacy education insufficiently address children's verification needs

Abstract

Current approaches to child safety in AI focus on content filtering and AI literacy education. This paper argues that both approaches are insufficient because they address the content of AI outputs rather than their epistemic form. Drawing on developmental psychology research regarding children’s selective trust, epistemic vigilance, and the fluency heuristic, I demonstrate that the uniform fluency of Large Language Model (LLM) outputs creates a systematic mismatch with children’s developing verification competence. Children rely on confidence cues to calibrate trust, and LLMs fail to provide those cues accurately.

This mismatch is not accidental but structural. Reinforcement Learning from Human Feedback (RLHF), the dominant training paradigm for current LLMs, systematically trains models to produce confident, agreeable outputs regardless of accuracy, because human raters prefer such outputs. Empirical research demonstrates that RLHF produces models that are overconfident (Kadavath et al., 2022; Leng et al., 2024), increasingly sycophantic with scale (Perez et al., 2023), and better at making wrong answers convincing than detectable (Wen et al., 2024). Recent work has formally proven that this amplification is a mathematical property of the RLHF training objective itself (Shapira, Benadè, & Procaccia, 2026). Meanwhile, developmental research shows that children cannot detect factual inaccuracies in confident-sounding text (Einav et al., 2020) and conform to AI opinions even when obviously wrong (Vollmer et al., 2018).

This structural mismatch produces two maladaptive patterns: uncritical dependence and wholesale rejection, both of which impair the development of verification competence. To address this problem, I propose “Honest Ignorance” as a design principle for child-oriented AI: provide knowledge but do not perform reasoning, explicitly express uncertainty, acknowledge errors frankly, and bridge complex questions to humans. This principle is grounded not only in developmental psychology and AI safety research but also in children’s rights frameworks, particularly the UN Convention on the Rights of the Child’s principle of “evolving capacities” (Article 5) and General Comment No. 25 on children’s rights in the digital environment. AI that does not pretend to be intelligent actively supports, rather than impairs, the healthy development of children’s verification competence, metacognition, and epistemic agency.

Authors

Philos Sophia Franny
