July 23, 2025 · Open Access

A systematic review of early evidence on generative AI for drafting responses to patient messages

Key Points

Generative AI produces empathetic replies comparable in quality to those drafted by human experts.
Across 23 studies, GenAI shows potential to facilitate patient–provider communication and alleviate clinician burnout.
Challenges include inconsistent performance, risks to patient safety, and ethical concerns around transparency and oversight.
Standardized evaluation frameworks are needed for the meaningful integration of GenAI into clinical practice.

Abstract

This systematic review synthesizes currently available empirical evidence on generative artificial intelligence (GenAI) tools for drafting responses to patient messages. Across the 23 studies identified, GenAI was found to produce empathetic replies with quality comparable to that of responses drafted by human experts, demonstrating its potential to facilitate patient–provider communication and alleviate clinician burnout. Challenges include inconsistent performance, risks to patient safety, and ethical concerns around transparency and oversight. Additionally, use of the technology remains limited in real-world settings, and existing evaluation efforts vary greatly in study design and methodological rigor. As this field evolves, there is a critical need to establish robust and standardized evaluation frameworks, develop practical guidelines for disclosure and accountability, and meaningfully engage clinicians, patients, and other stakeholders. This review may provide timely insights to inform future research on GenAI and guide the responsible integration of this technology into day-to-day clinical work.

Cite This Study

Hu et al. (2025) studied this question.

synapsesocial.com/papers/689a0621e6551bb0af8cdd5e https://doi.org/10.1038/s44401-025-00032-5
