What question did this study set out to answer?

To explore how large language models can assist patients in navigating emergency care and alleviate strains on EDs.

April 23, 2026 | Open Access

The Potential Role of Large Language Models in Assisting Patients and Guiding Emergency Care Visits

Key Points

Identified 238 unique patient questions through a structured web search.
Selected 15 representative questions and submitted them to three large language models: ChatGPT, DeepSeek, and Gemini.
Clinical experts assessed the responses for quality and effectiveness.
ChatGPT was the best-performing model in 60% of cases.
Clarity received the highest positive rating (79%) among all response domains.
Over two-thirds of clinical raters were willing to integrate LLM tools into practice.

Abstract

Background/Objectives: Overcrowding in emergency departments (EDs) remains a critical challenge in modern healthcare systems, driven in part by patient uncertainty regarding symptom urgency and a lack of accessible medical guidance. Recent advances in artificial intelligence, particularly large language models (LLMs), present a novel opportunity to support patient navigation and relieve pressure on ED infrastructure.

Methods: A total of 238 unique patient questions were identified through a structured web search. Following deduplication and thematic clustering, 15 representative questions were selected. Each question was submitted to three LLMs (ChatGPT (v3.5), DeepSeek, and Gemini) using a standardized prompt. Responses were assessed by clinical experts (N = 8) who were blinded to the model source. Reviewers selected the best overall response per question and also rated each of the three LLMs' individual responses to that question.

Results: ChatGPT was selected as the best-performing model in 60% of cases, with DeepSeek and Gemini selected in 23% and 17%, respectively. ChatGPT responses also achieved the highest proportion of “excellent” quality ratings and the lowest proportion of “unsatisfactory” outputs. Across all models, clarity was the most positively rated domain (79% agreement), followed by empathy (72%), length/detail appropriateness (71%), and completeness (65%). Over two-thirds of raters expressed willingness to integrate LLM-based tools into clinical practice for patient education and pre-triage counseling.

Conclusions: Large language models demonstrate promising capabilities in responding to emergency care-related patient queries. Their ability to deliver medically sound and communicatively effective answers positions them as potential digital adjuncts in the management of low-acuity ED presentations and prehospital triage.

Gerhardinger et al. studied this question.

synapsesocial.com/papers/69e9ba2a85696592c86ec6d5 https://doi.org/10.3390/jcm15083170
