Sepsis prediction in intensive care units (ICUs) remains a major clinical challenge because delayed recognition substantially increases mortality and treatment complexity. This study proposes a machine learning-driven framework for early sepsis prediction that integrates structured preprocessing of ICU time-series data, multi-horizon risk modeling, and explainable prediction analysis. The framework is evaluated on the publicly available PhysioNet/Computing in Cardiology 2019 Sepsis Challenge dataset, which comprises 40,336 ICU patient records from two hospital systems and approximately 1.42 million hourly clinical observations. The data are substantially class-imbalanced, with septic cases representing approximately 7.3% of the dataset. Three widely used machine learning algorithms (logistic regression, random forest, and gradient boosting) were implemented to establish baseline predictive performance. Prediction capability is evaluated at three clinically relevant time horizons: at sepsis onset, 3 hours prior to onset, and 6 hours prior to onset, enabling systematic assessment of early warning capability. Model performance was assessed using discrimination and calibration metrics, including area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), sensitivity at fixed specificity, and Brier score. To support clinical transparency, explainable artificial intelligence techniques based on SHapley Additive exPlanations (SHAP) were incorporated to identify both global and patient-level predictors influencing model outputs. Results show that gradient boosting achieved the strongest predictive performance at all three prediction horizons, with an AUROC of 0.89 at sepsis onset and clinically meaningful discrimination at the earlier prediction windows. Explainability analysis highlights physiologically relevant predictors consistent with established sepsis pathophysiology.
These findings demonstrate that ensemble-based machine learning models can provide accurate, calibrated, and interpretable early sepsis prediction using routinely collected ICU data, supporting the development of reliable clinical decision-support tools for timely identification of high-risk patients.
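Two of the metrics named above, the Brier score and sensitivity at fixed specificity, are less familiar than AUROC, so a minimal sketch of how they might be computed is given here. It is an illustration under stated assumptions, not the study's implementation: the function names `brier_score` and `sensitivity_at_specificity` are hypothetical, the rule that a patient is flagged when the score is at or above the threshold is an assumed convention, and the toy labels and scores are invented for the example rather than drawn from the Challenge dataset.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome.

    Lower is better; 0.0 means perfectly confident, perfectly correct predictions.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def sensitivity_at_specificity(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    target_specificity: float = 0.95,
) -> float:
    """Highest sensitivity over all thresholds whose specificity meets the target.

    Assumed convention: a patient is flagged when score >= threshold.
    This brute-force scan is quadratic in the number of patients; it is
    written for readability, not for 1.4 million hourly rows.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("inputs must be of equal length")
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best = 0.0  # corresponds to a threshold so high that nobody is flagged
    for threshold in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
        fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
        specificity = (negatives - fp) / negatives
        if specificity >= target_specificity:
            best = max(best, tp / positives)
    return best


# Toy example (invented values): 3 septic and 5 non-septic patients.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.7, 0.9]

toy_brier = brier_score(toy_labels, toy_scores)
toy_sens = sensitivity_at_specificity(toy_labels, toy_scores, target_specificity=0.8)
```

On this toy data, requiring a specificity of at least 0.8 permits at most one false alarm among the five non-septic patients, which caps the achievable sensitivity at two of the three septic patients. Fixing specificity in this way ties the reported sensitivity to an explicit false-alarm budget, which matters under class imbalance of the kind described above.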
Ogunwale et al. studied this question.