September 18, 2025 | Open Access

Building a Safe and Transparent Workflow for Large Language Model (LLM)-Assisted Clinical Trials and Prediction Models: A Technical Report

Key Points

The proposed workflow enhances efficiency in clinical trials while addressing concerns around privacy and accuracy.
Reusable checklists link study types to international reporting guidelines, promoting transparency and accountability.
Governance and technical safeguards aim to mitigate risks from biased datasets and over-reliance on automated text.
The framework supports human reasoning, ensuring responsible integration of large language models in clinical research.

Abstract

The use of large language models (LLMs) in clinical trials and prediction models is expanding rapidly, offering opportunities for efficiency but also raising concerns about privacy, fairness, accuracy, and accountability. This technical report proposes a structured workflow to support research teams in adopting LLMs while preserving scientific standards and public trust. The workflow is organized into seven sequential steps: (i) scope definition and governance, (ii) retrieval-augmented literature review, (iii) model evaluation and benchmarking, (iv) documentation and audit trail, (v) expert quality gates, (vi) manuscript disclosure, and (vii) privacy and security safeguards. To facilitate adoption, we provide reusable checklists that map study types to relevant international reporting guidelines, including Consolidated Standards of Reporting Trials - Artificial Intelligence (CONSORT-AI), Standard Protocol Items: Recommendations for Interventional Trials - Artificial Intelligence (SPIRIT-AI), Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis - Artificial Intelligence (TRIPOD+AI), Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), and Developmental and Exploratory Clinical Investigations of Decision Support Systems Driven by Artificial Intelligence (DECIDE-AI). The framework is designed to mitigate risks such as fabricated citations, biased outputs from skewed datasets, and over-reliance on automated text. Rather than replacing human reasoning, it aims to augment it, offering greater speed while maintaining accountability, reproducibility, and transparency. By combining governance rules, technical safeguards, and human oversight, this workflow provides a practical and auditable path for integrating LLMs into clinical trials and prediction models without eroding confidence in scientific work.
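
As an illustration only, the checklist idea described in the abstract might be sketched as a small lookup that pairs a study type with a reporting guideline and then tracks human sign-off across the seven workflow steps. The names `StudyType`, `GUIDELINE_FOR`, `ChecklistItem`, `build_checklist`, and `next_open_step` are hypothetical and do not come from the report; the study-type categories are assumptions chosen to match the general scope of each guideline, and the report's actual checklists may be organized differently.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class StudyType(Enum):
    # Hypothetical category names, chosen for this sketch only.
    TRIAL_PROTOCOL = "trial protocol"
    TRIAL_REPORT = "trial report"
    PREDICTION_MODEL = "prediction model"
    SYSTEMATIC_REVIEW = "systematic review"
    EARLY_CLINICAL_EVALUATION = "early clinical evaluation"


# Assumed pairing, based on the general scope of each guideline.
GUIDELINE_FOR = {
    StudyType.TRIAL_PROTOCOL: "SPIRIT-AI",
    StudyType.TRIAL_REPORT: "CONSORT-AI",
    StudyType.PREDICTION_MODEL: "TRIPOD+AI",
    StudyType.SYSTEMATIC_REVIEW: "PRISMA",
    StudyType.EARLY_CLINICAL_EVALUATION: "DECIDE-AI",
}

# The seven sequential steps, in the order listed in the abstract.
WORKFLOW_STEPS = (
    "scope definition and governance",
    "retrieval-augmented literature review",
    "model evaluation and benchmarking",
    "documentation and audit trail",
    "expert quality gates",
    "manuscript disclosure",
    "privacy and security safeguards",
)


@dataclass
class ChecklistItem:
    step: str
    guideline: str
    signed_off_by: Optional[str] = None  # named human reviewer, if any


def build_checklist(study_type: StudyType) -> List[ChecklistItem]:
    """Create one unsigned checklist item per workflow step."""
    guideline = GUIDELINE_FOR[study_type]
    return [ChecklistItem(step, guideline) for step in WORKFLOW_STEPS]


def next_open_step(checklist: List[ChecklistItem]) -> Optional[str]:
    """Return the first step without human sign-off, or None if all are signed."""
    for item in checklist:
        if item.signed_off_by is None:
            return item.step
    return None


if __name__ == "__main__":
    checklist = build_checklist(StudyType.PREDICTION_MODEL)
    checklist[0].signed_off_by = "reviewer-A"  # placeholder reviewer ID
    print(checklist[0].guideline)
    print(next_open_step(checklist))
```

Keeping sign-off as an explicit field on every step is one way to reflect the report's emphasis on human oversight: a step counts as complete only when a named person has approved it, not when the LLM has produced output.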
