Abstract
Artificial intelligence (AI) chatbots are increasingly used for self-triage and medical advice seeking. Accurate AI performance, however, hinges on how users interact with such consumer-facing applications. While previous research has identified reservations regarding AI-generated medical advice, earlier stages of human–AI interaction, such as how symptoms are communicated, remain largely unexplored. In a preregistered between-subject experiment (n = 500), participants were randomly assigned to provide simulated symptom reports for common medical conditions to either an AI chatbot or a human physician. We evaluated the quality of the reports for an initial medical urgency assessment using physician-validated large language model-based suitability metrics. Participants who believed they were interacting with an AI tool (versus a physician) provided lower-quality symptom reports for medical triage. Our findings indicate a bias in how users communicate symptoms in digital settings. This outcome could compromise the performance of consumer-facing AI tools in real-world applications, regardless of the underlying model’s actual capacity.
Reis et al. studied this question.