May 31, 2026 | Open Access

When Algorithms Speak Louder than Empathy: Mechanistic Interpretability as a Costly Authenticity Signal in AI-Mediated E-Commerce Customer Relationships

Key Points

This research investigates how mechanistic interpretability can enhance the credibility of AI empathy in e-commerce customer service.
Two between-subjects experiments, preceded by a feasibility pilot, were conducted with Mainland Chinese consumers recruited via the Credamo online panel.
Study 1 compared high and low AI empathy effects on brand intimacy.
Study 2 tested the interaction of AI empathy and mechanistic interpretability in a factorial design.
Study 1 showed a pattern consistent with high (versus low) AI empathy lowering brand intimacy through reduced perceived authenticity.
Study 2 replicated this backfire when interpretability was absent; when interpretability was present, the effect of AI empathy on perceived authenticity reversed sign.
Mechanistic interpretability neutralized the negative indirect effect of AI empathy on brand intimacy through perceived authenticity.

Abstract

Conversational AI agents are now a routine touchpoint in e-commerce customer service, and AI empathy has emerged as the headline humanization strategy for repairing relational damage during service failures. A growing evidence base reports that empathic AI often backfires, because consumers cannot reconcile felt warmth with their lay model of what an artificial agent is. This research asks under what conditions AI empathy can be made credible to consumers. We propose that mechanistic interpretability, operationalized in the present studies as a consumer-facing visualization of an AI agent’s internal emotion-vector activations designed in the style of mechanistic-interpretability research, operates as a costly authenticity signal that rehabilitates empathic AI by enabling an attributional shift along the experience dimension of mind perception. Signaling Theory carries the antecedent stage of the causal chain, where mechanistic interpretability serves as a verifiable cue of computational authenticity. Mind Perception Theory carries the downstream stage, where the authenticated empathy is converted into consumer-brand intimacy. Two between-subjects experiments preceded by a feasibility pilot tested the account on Mainland Chinese consumers recruited via the Credamo online panel. Study 1 used a single-factor design contrasting high versus low AI empathy. Study 2 used a two (AI empathy) by two (mechanistic interpretability) full factorial. Study 1 showed a pattern consistent with high (versus low) AI empathy lowering brand intimacy through reduced perceived authenticity. Study 2 replicated the AI-empathy backfire when interpretability was absent, reversed the sign of the AI-empathy slope on the perceived-authenticity mediator when interpretability was present, and neutralized the negative conditional indirect effect on brand intimacy through perceived authenticity. 
The findings introduce mechanistic interpretability to consumer-marketing scholarship as a manipulable signaling channel, document a structural reversal in the mediator-stage slope coupled with neutralization of the indirect effect on the relational outcome, and prescribe pairing empathic AI phrasing with mechanistic-transparency design rather than deploying empathy without an accompanying transparency cue.
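
The conditional indirect effect described in the abstract can be illustrated with a minimal sketch. It assumes a first-stage moderated mediation model with both factors dummy coded (0 = low or absent, 1 = high or present); the abstract does not state the exact model specification or coding, so this structure is an assumption. The class name, the coefficient names (a1, a3, b), and all numeric values are hypothetical placeholders chosen to show the arithmetic, not estimates from the studies.

```python
from dataclasses import dataclass


@dataclass
class FirstStageModel:
    """Hypothetical coefficients for a first-stage moderated mediation model.

    Mediator: authenticity = i_m + a1*empathy + a2*interp + a3*empathy*interp
    Outcome:  intimacy     = i_y + c*empathy + b*authenticity
    Assumption: both factors are dummy coded (0 = low/absent, 1 = high/present).
    """

    a1: float  # empathy slope on authenticity when interpretability is absent
    a3: float  # empathy x interpretability interaction on authenticity
    b: float   # authenticity slope on brand intimacy

    def empathy_slope(self, interp: int) -> float:
        """Slope of empathy on the mediator at a given interpretability level."""
        return self.a1 + self.a3 * interp

    def conditional_indirect_effect(self, interp: int) -> float:
        """Indirect effect of empathy on intimacy via authenticity."""
        return self.empathy_slope(interp) * self.b

    def index_of_moderated_mediation(self) -> float:
        """Change in the indirect effect when interpretability goes from 0 to 1."""
        return self.a3 * self.b


# Placeholder values for illustration only; these are NOT study estimates.
model = FirstStageModel(a1=-0.5, a3=0.8, b=0.6)
for interp in (0, 1):
    print(
        f"interpretability={interp}: "
        f"mediator slope={model.empathy_slope(interp):.2f}, "
        f"indirect effect={model.conditional_indirect_effect(interp):.2f}"
    )
```

In this sketch the mediator-stage slope changes sign only if a3 is positive and larger in magnitude than a1. Point estimates alone cannot show that an indirect effect is neutralized; that conclusion would rest on interval estimates (for example, bootstrap confidence intervals), which the sketch omits.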
