Abstract
Rationale Extubation failure, defined as the need for re-intubation or death within 48 hours of extubation, is associated with prolonged ICU stay and higher mortality. Early identification of high-risk patients could enable timely interventions and improve outcomes. We aimed to develop and internally validate a machine learning model to predict extubation failure using a large, publicly available critical-care database.
Methods Data were extracted from the eICU Collaborative Research Database Demo, including demographics (age, gender, ethnicity), clinical severity scores (acute physiology and APACHE scores), and predicted ICU mortality. Extubation failure was defined as re-intubation or death within 48 hours after extubation. The dataset was randomly divided into a training cohort (80%) and an internal validation cohort (20%). A random-forest classifier was trained using clinical and physiologic variables, and model performance on the validation cohort was assessed using accuracy, area under the receiver-operating characteristic curve (AUC), sensitivity, specificity, and F1-score.
Results A total of 1,815 extubation events were analyzed, comprising 1,452 in the training cohort and 363 in the internal validation cohort. After removal of non-clinical identifiers, the random-forest model demonstrated strong performance in predicting extubation failure within 48 hours. On the validation cohort, the model achieved an accuracy of 0.88, AUC 0.84, sensitivity 0.95, specificity 0.61, and F1-score 0.93. Feature-importance analysis identified predicted ICU mortality, APACHE score, acute physiology score, and age as the most influential predictors of extubation failure. Compared with prior models requiring numerous laboratory inputs, this approach uses a multicenter dataset and a minimal, routinely available feature set to enhance interpretability and clinical applicability.
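The validation metrics named above (accuracy, AUC, sensitivity, specificity, F1-score) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the toy labels and scores, the 0.5 decision threshold, and the coding of the positive class as 1 are all assumptions made for illustration; the abstract does not state which outcome was treated as the positive class or which threshold was used.

```python
# Illustrative sketch only: toy data, hypothetical function names.
# Assumes binary labels coded 0/1 with both classes present.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def summary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)   # true-positive rate (recall)
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)     # positive predictive value
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy validation set: 1 = positive class (assumed coding), 0 = negative.
y_true = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.2]          # hypothetical model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = summary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy, sensitivity, specificity, and F1 depend on the chosen threshold, whereas AUC is computed from the ranking of scores alone, which is why the two kinds of measure can diverge on imbalanced cohorts.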
Conclusions A random-forest model using routinely available ICU data accurately predicted extubation failure within 48 hours, showing robust internal validation performance. Machine learning-based prediction tools could support early risk stratification and clinical decision-making in critical-care practice. Future studies should focus on external validation across larger multicenter datasets.
This abstract is funded by: None
Doshi et al. (Fri,) studied this question.