Purpose: Large language models (LLMs) improve medical writing efficiency but introduce methodological, ethical, and legal risks. This review examines current evidence on the limitations of LLM-assisted medical writing and proposes principles for its responsible integration into biomedical research.

Current concepts: LLMs are commonly used for draft generation, language editing, literature summarization, reference handling, statistical code generation, and manuscript structuring. Studies consistently report improved readability and reduced writing time, particularly among non-native English-speaking authors. However, recurrent challenges include factual hallucinations; fabricated or inaccurate citations; incomplete retrieval of recent literature due to training cutoffs; prompt-sensitive statistical errors; ambiguity regarding authorship and accountability; risks of unintended plagiarism; and concerns related to patient data privacy. These limitations arise from the probabilistic nature of LLMs and their lack of intrinsic fact-verification mechanisms or ethical reasoning.

Discussion and conclusion: Risks associated with LLM use vary by manuscript stage and therefore require differentiated oversight. LLMs should be confined primarily to language refinement rather than fact generation, and literature-related outputs must be verified against primary sources, preferably using retrieval-augmented tools. Statistical analyses should remain under human control, with independent validation of all outputs. Ethical governance requires transparent disclosure of LLM use, clear assignment of human responsibility, and strict safeguards for sensitive data. A dual framework combining human-in-the-loop and human-on-the-loop oversight offers a pragmatic model for balancing efficiency with scientific rigor. When positioned as augmentative tools rather than autonomous agents, LLMs can be responsibly integrated into medical research without compromising integrity or reproducibility.
Ki-Hyun Jeon studied this question.