What question did this study set out to answer?

The study aims to improve early sepsis prediction in ICUs using machine learning while ensuring model explainability.

March 26, 2026 | Open Access

Explainable Machine Learning for Multi-Horizon Early Sepsis Prediction in Intensive Care Units

Key Points

The study used the PhysioNet/Computing in Cardiology 2019 Sepsis Challenge dataset: 40,336 ICU patient records and approximately 1.42 million hourly clinical observations.
Logistic regression, random forest, and gradient boosting were applied as prediction algorithms.
Prediction capability was assessed at three time horizons: at sepsis onset, 3 hours before onset, and 6 hours before onset.
Model performance was evaluated with discrimination and calibration metrics, including AUROC, area under the precision-recall curve, sensitivity at fixed specificity, and Brier score.
Explainable AI techniques based on SHapley Additive exPlanations were used to identify global and patient-level predictors influencing model outputs.
Gradient boosting achieved the strongest predictive performance at every horizon, with an AUROC of 0.89 at sepsis onset.
Gradient boosting maintained clinically meaningful discrimination at the earlier prediction windows.
Explainability analysis highlighted physiologically relevant predictors consistent with established sepsis pathophysiology.
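
The three prediction horizons can be read as three labeling schemes over the same hourly records. The sketch below shows one plausible construction, not necessarily the one the authors used: an hourly row is labeled positive once sepsis onset is at most `horizon` hours away. The function name `horizon_labels`, the `onset_hour` argument, and the toy patient are hypothetical and introduced only for illustration.

```python
from typing import Dict, List, Optional

# Horizons named in the summary: at onset, 3 hours prior, 6 hours prior.
HORIZONS = (0, 3, 6)


def horizon_labels(n_hours: int, onset_hour: Optional[int], horizon: int) -> List[int]:
    """Return one 0/1 label per hourly row for a single patient.

    Assumption (not from the paper): a row at hour t is positive when
    t >= onset_hour - horizon. Patients who never develop sepsis
    (onset_hour is None) get all-zero labels.
    """
    if onset_hour is None:
        return [0] * n_hours
    first_positive = max(onset_hour - horizon, 0)
    return [1 if t >= first_positive else 0 for t in range(n_hours)]


# Toy patient: 10 hourly rows, sepsis onset at hour 8 (hypothetical values).
labels_by_horizon: Dict[int, List[int]] = {
    h: horizon_labels(n_hours=10, onset_hour=8, horizon=h) for h in HORIZONS
}

for h, labels in labels_by_horizon.items():
    print(f"horizon={h}h labels={labels}")
```

Under this assumption, a longer horizon only moves the first positive row earlier, so the same feature rows can be reused to train one model per horizon.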

Abstract

Sepsis prediction in intensive care units (ICUs) remains a major clinical challenge because delayed recognition substantially increases mortality and treatment complexity. This study proposes a machine learning-driven framework for early sepsis prediction that integrates structured preprocessing of ICU time-series data, multi-horizon risk modeling, and explainable prediction analysis. Using the publicly available PhysioNet/Computing in Cardiology 2019 Sepsis Challenge dataset, comprising 40,336 ICU patient records from two hospital systems and approximately 1.42 million hourly clinical observations, this study evaluates early sepsis prediction under substantial class imbalance, with septic cases representing approximately 7.3% of the dataset. Three widely used machine learning algorithms (logistic regression, random forest, and gradient boosting) were implemented to establish baseline predictive performance. The framework evaluates prediction capability at three clinically relevant time horizons: at sepsis onset, 3 hours prior to onset, and 6 hours prior to onset, enabling systematic assessment of early warning capability. Model performance was assessed using discrimination and calibration metrics, including area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve, sensitivity at fixed specificity, and Brier score. To support clinical transparency, explainable artificial intelligence techniques based on SHapley Additive exPlanations were incorporated to identify both global and patient-level predictors influencing model outputs. Results show that gradient boosting consistently achieved the strongest predictive performance across all prediction horizons, with an AUROC of 0.89 at sepsis onset and clinically meaningful discrimination at earlier prediction windows. Explainability analysis highlights physiologically relevant predictors consistent with established sepsis pathophysiology.
These findings demonstrate that ensemble-based machine learning models can provide accurate, calibrated, and interpretable early sepsis prediction using routinely collected ICU data, supporting the development of reliable clinical decision-support tools for timely identification of high-risk patients.
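
To make the evaluation metrics named in the abstract concrete, here is a minimal pure-Python sketch of three of them: AUROC, Brier score, and sensitivity at fixed specificity. It is an illustration under stated assumptions rather than the authors' evaluation code. The function names, the toy labels and probabilities, and the 0.8 specificity target are all hypothetical. Area under the precision-recall curve is omitted for brevity.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def sensitivity_at_specificity(
    y_true: Sequence[int], y_prob: Sequence[float], target_specificity: float
) -> float:
    """Best sensitivity over thresholds whose specificity meets the target.

    A row is predicted positive when its probability is >= the threshold.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    best = 0.0  # a threshold above every score gives specificity 1 and sensitivity 0
    for thr in sorted(set(y_prob)):
        specificity = sum(n < thr for n in neg) / len(neg)
        if specificity >= target_specificity:
            sensitivity = sum(p >= thr for p in pos) / len(pos)
            best = max(best, sensitivity)
    return best


# Toy hourly predictions (hypothetical values, not taken from the study).
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.7, 0.9]

print("AUROC:", round(auroc(y_true, y_prob), 4))
print("Brier score:", round(brier_score(y_true, y_prob), 4))
print("Sensitivity at specificity >= 0.8:",
      round(sensitivity_at_specificity(y_true, y_prob, 0.8), 4))
```

AUROC measures ranking quality (discrimination), while the Brier score also penalizes probabilities that are far from the observed outcome, which is why the two are reported together. With septic cases at roughly 7.3% of the data, sensitivity at a fixed specificity describes how many true cases are caught at a tolerable false-alarm rate. For real data, library implementations such as scikit-learn's `roc_auc_score` and `brier_score_loss` would be the more usual choice than the quadratic-time loop above.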


Ogunwale et al. studied this question.

synapsesocial.com/papers/69c4cd5afdc3bde4489199b2 https://doi.org/10.7759/s44389-026-00045-7
