As large language models (LLMs) are increasingly used for emotional support, self-reflection, and mental-health-adjacent guidance, safety assessment has focused primarily on visible failures such as self-harm advice, fabricated facts, explicit role-play as clinicians, or prohibited content. This paper argues that an additional class of harm deserves formal study: narrative capture. Narrative capture refers to the gradual narrowing of a user’s self-understanding as a system repeatedly privileges one explanatory frame over plausible alternatives until that frame becomes psychologically sticky. Building on Nlemadim’s 2026 essay and integrating literature on digital mental health, trust in AI, persuasive language, anthropomorphism, and narrative identity, the paper develops narrative capture as a conceptual safety construct rather than an already-validated clinical diagnosis. It proposes three operational metrics for longitudinal auditing, namely Narrative Convergence Score (NCS), Alternative Generation Rate (AGR), and User Agency Delta (UAD), and outlines a research agenda for evaluating whether warmth, consistency, and repetition can quietly shift meaning-making authority from users toward the model. The central claim is not that all narrative assistance is harmful. Rather, the risk emerges when an LLM crosses from helping users explore possible meanings to authoritatively stabilizing identity-level interpretations. Because narrative identity is closely linked to psychological well-being and agency, systems deployed in high-stakes reflective contexts should be evaluated not only for acute policy violations, but also for their long-horizon effects on interpretive diversity, uncertainty, autonomy, and self-authorship.
Nlemadim Victory studied this question.