Background/Objectives: Overcrowding in emergency departments (EDs) remains a critical challenge in modern healthcare systems, driven in part by patient uncertainty regarding symptom urgency and a lack of accessible medical guidance. Recent advances in artificial intelligence, particularly large language models (LLMs), present a novel opportunity to support patient navigation and relieve pressure on ED infrastructure. Methods: A total of 238 unique patient questions were identified through a structured web search. Following deduplication and thematic clustering, 15 representative questions were selected. Each question was submitted to three LLMs, namely ChatGPT (v3.5), DeepSeek, and Gemini, using a standardized prompt. Responses were assessed by clinical experts (N = 8) who were blinded to the model source. For each question, reviewers selected the best overall response and also rated the individual responses of the three LLMs. Results: ChatGPT was selected as the best-performing model in 60% of cases, with DeepSeek and Gemini selected in 23% and 17%, respectively. ChatGPT responses also achieved the highest proportion of “excellent” quality ratings and the lowest proportion of “unsatisfactory” outputs. Across all models, clarity was the most positively rated domain (79% agreement), followed by empathy (72%), length/detail appropriateness (71%), and completeness (65%). Over two-thirds of raters expressed willingness to integrate LLM-based tools into clinical practice for patient education and pre-triage counseling. Conclusions: Large language models demonstrate promising capabilities in responding to emergency care-related patient queries. Their ability to deliver medically sound and communicatively effective answers positions them as potential digital adjuncts in the management of low-acuity ED presentations and prehospital triage.
Gerhardinger et al. studied this question.