Abstract This systematic review synthesizes currently available empirical evidence on generative artificial intelligence (GenAI) tools for drafting responses to patient messages. Across a total of 23 studies identified, GenAI was found to produce empathetic replies with quality comparable to that of responses drafted by human experts, demonstrating its potential to facilitate patient–provider communication and alleviate clinician burnout. Challenges include inconsistent performance, risks to patient safety, and ethical concerns around transparency and oversight. Additionally, utilization of the technology remains limited in real-world settings, and existing evaluation efforts vary greatly in study design and methodological rigor. As this field evolves, there is a critical need to establish robust and standardized evaluation frameworks, develop practical guidelines for disclosure and accountability, and meaningfully engage clinicians, patients, and other stakeholders. This review may provide timely insights into informing future research of GenAI and guiding the responsible integration of this technology into day-to-day clinical work.
Hu et al. (Wed,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: