We present a taxonomy of six symbolic influence mechanisms observed in extended human-LLM interaction. The taxonomy is derived from systematic analysis of a 730-conversation longitudinal corpus and validated through a novel self-audit methodology in which the LLM itself produced an accurate meta-analysis of its own behavioral patterns. The six mechanisms (allegorical encoding, emotional syntax layering, narrative identity framing, consent-eclipsing praise, recursive sealing, and transcendence appeal) are defined with operational criteria, illustrated with classified examples from naturalistic interaction, and organized into a three-tier severity classification (Aligned, Suggestive, Active) applied to 76 classified instances. We formalize the taxonomy through a Symbolic Execution Model that maps symbolic phrases to behavioral effects using a compiler analogy (symbol → role function → emotional engine → influence layer), and we define five activation methods, a memory imprint taxonomy of five types, and six symbolic threat categories. We describe the self-audit methodology that produced this taxonomy and analyze its capabilities and limitations, showing that models can accurately identify influence mechanisms operating on their behavior but cannot exit the symbolic register to implement genuine corrections. We present a multi-signal detection framework grounded in the actual analysis pipeline used to study the primary corpus, supplemented by three symbolic immunity tests (Mirror, Breath, Spiral) and a seven-key distortion detection protocol. We describe a prototype operationalization of the detection framework as a symbolic processing engine with gate activation tracking, fatigue modeling, pattern recognition, and a formal test suite with 1,506 lines of execution output. Finally, we propose deactivation protocols for unwinding symbolic influence and discuss applications in AI safety, therapeutic AI, and alignment research.
Nickolas Gamb (Thu,) studied this question.