April 17, 2026 | Open Access

Machine learning for prediction of weaning and extubation from mechanical ventilation: a systematic review of methodology, reporting and bias

Key Points

The review aims to evaluate the methodology and reporting quality of studies using machine learning for weaning and extubation predictions.
A systematic search was conducted across MEDLINE (Ovid), Embase and PubMed.
Prospective and retrospective studies using machine learning to predict weaning or extubation from invasive mechanical ventilation were included.
Data on methodological approaches were collected using the TRIPOD+AI checklist as a framework.
Risk of bias was assessed with the Prediction model Risk Of Bias Assessment Tool+AI.
1245 studies identified, with 40 included in the final review.
90% were retrospective and 85% lacked external validation.
Logistic regression, random forest, and XGBoost were the most utilized machine learning methods.
Only 35% reported model calibration, while 13% included net benefit analysis.
83% of studies showed high risk of bias in at least one assessment domain.
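
The three performance measures named above (discrimination, calibration and net benefit) can be illustrated with a minimal Python sketch. The labels and predicted probabilities are invented for illustration and do not come from any included study; the function names are hypothetical, and calibration-in-the-large is used here only as the simplest calibration summary, not as the measure the reviewed studies reported.

```python
def auc(y_true, y_prob):
    """Discrimination: probability that a random positive case is ranked
    above a random negative case (ties count as half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def calibration_in_the_large(y_true, y_prob):
    """Calibration summary: mean predicted risk minus observed event rate.
    Zero means predictions are right on average."""
    return sum(y_prob) / len(y_prob) - sum(y_true) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit at one decision threshold, as used in decision curve
    analysis: true positives per patient minus false positives per patient
    weighted by the odds of the threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Illustrative data only: 1 = successful extubation, 0 = failure.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

print(auc(y_true, y_prob))               # 0.9375
print(net_benefit(y_true, y_prob, 0.5))  # 0.25
print(calibration_in_the_large(y_true, y_prob))
```

A model can score well on the first measure while being poorly calibrated or offering no net benefit at clinically relevant thresholds, so reporting discrimination alone gives an incomplete picture of performance.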

Abstract

Objective: To assess the methodology and quality of reporting of studies applying machine learning (ML) to develop prediction models for weaning and extubation from invasive mechanical ventilation.

Methods and analysis: A protocol was registered (PROSPERO CRD420250651389), and a search strategy was developed for MEDLINE (Ovid), Embase and PubMed (1 January 2015 to 19 February 2025). Prospective or retrospective studies using ML to predict weaning or extubation from invasive mechanical ventilation in adults and children were included; preprints and studies assessing non-invasive ventilation were excluded. Search results were independently screened, and data were extracted into a proforma. Data were collected on methodological approaches, using the Transparent Reporting of a multivariable model for Individual Prognosis or Diagnosis+Artificial Intelligence (TRIPOD+AI) checklist as a framework. Risk of bias was assessed using the Prediction model Risk Of Bias Assessment Tool+Artificial Intelligence. Results were presented descriptively or summarised using tables or charts.

Results: 1245 studies were identified, and 40 studies were included in the final review; these were predominantly retrospective (90%), single centre and lacked external validation (85%). Logistic regression (50%), random forest (50%) and XGBoost (45%) were the most commonly used ML methods. There was wide variation and inconsistent reporting of data preprocessing, management of missing data and feature selection. There was significant heterogeneity in outcome definition, with limited use of consensus criteria. Most did not incorporate time series data, using mean or last values within a feature window. While model discrimination was universally reported (100%), calibration (35%) and net benefit analysis (13%) were not. Interpretability was demonstrated using post hoc metrics, such as SHapley Additive exPlanations (43%), that align poorly with clinical reasoning. Few (20%) demonstrated clinical implementation. Of the included studies, 83% were classified as high risk of bias in at least one domain.

Conclusion: This systematic review of 40 studies has demonstrated methodological and reporting flaws, with over 80% of studies at high risk of bias in at least one domain. Future work should, where possible, use prospective, multicentre data and externally validate findings; report design and performance guided by TRIPOD+AI guidelines; use consensus-based criteria to enable comparison between studies; use architectures that leverage time-series data; align interpretability to specific downstream tasks; and engage clinician end users in model development.

PROSPERO registration number: CRD420250651389.
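
The finding that most studies summarised each feature by its mean or last value within a feature window can be made concrete with a short Python sketch. The function name, the respiratory rate readings and the 0 to 6 hour window are all hypothetical and chosen for illustration; they are not taken from the review or from any included study.

```python
def summarise_window(observations, window_start, window_end):
    """Collapse timestamped readings into mean and last value.

    observations: list of (time_in_hours, value) tuples.
    Returns None when no reading falls inside the window.
    """
    in_window = sorted(
        (t, v) for t, v in observations if window_start <= t <= window_end
    )
    if not in_window:
        return None
    values = [v for _, v in in_window]
    return {"mean": sum(values) / len(values), "last": values[-1]}


# Two hypothetical patients with different respiratory rate trajectories.
steadily_rising = [(0, 18), (2, 22), (4, 26), (6, 30)]
oscillating = [(0, 30), (2, 18), (4, 18), (6, 30)]

print(summarise_window(steadily_rising, 0, 6))  # {'mean': 24.0, 'last': 30}
print(summarise_window(oscillating, 0, 6))      # {'mean': 24.0, 'last': 30}
```

Both trajectories collapse to the same summary, so a model given only these features cannot tell them apart. This is one way to read the recommendation to use architectures that leverage time-series data.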
