What question did this study set out to answer?

To investigate how patient age influences risk prediction models in early warning systems for clinical deterioration.

March 22, 2026 · Open Access

Balancing Model Performance With Operational Realities in Early Warning Systems—Complexity Where It Matters

Key Points

Comparison of early warning scores in patients aged 80 years and older
Analysis of area under the receiver operating characteristic curve for different scores
Variable importance analysis of model inputs, focusing on interactions with age
The Rapid Emergency Medicine Score discriminated better for patients over 94 years
General decrease in model accuracy with increasing age for most scores
Age impacts the importance of certain physiological inputs like oxygen and blood pressure
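
The age-stratified comparison of discrimination listed above can be sketched in a few lines. This is a minimal illustration, not the study's analysis code: the helper names (`auroc`, `auroc_by_age_band`), the age bands, and the toy records are all assumptions made for the example. It computes the area under the receiver operating characteristic curve as the probability that a randomly chosen patient with the outcome has a higher score than a randomly chosen patient without it.

```python
from typing import Dict, List, Sequence, Tuple

# Each record is (age in years, early warning score, outcome), where
# outcome is 1 for ICU admission or death and 0 otherwise.
Record = Tuple[int, float, int]


def auroc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Rank-based AUROC: share of (event, non-event) pairs in which the
    event case has the higher score; tied pairs count as half."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auroc_by_age_band(
    records: List[Record], bands: List[Tuple[int, int]]
) -> Dict[Tuple[int, int], float]:
    """AUROC computed separately within each [low, high) age band."""
    result = {}
    for low, high in bands:
        in_band = [(s, y) for age, s, y in records if low <= age < high]
        result[(low, high)] = auroc(
            [s for s, _ in in_band], [y for _, y in in_band]
        )
    return result


# Toy data, invented for illustration only (not from the study).
toy_records: List[Record] = [
    (82, 3, 0), (84, 7, 1), (85, 2, 0), (83, 6, 1),
    (96, 5, 0), (97, 4, 1), (95, 6, 1), (98, 3, 0),
]
# Hypothetical bands: 80 to 94 years, and 95 years and older.
toy_bands = [(80, 95), (95, 200)]

if __name__ == "__main__":
    for band, value in auroc_by_age_band(toy_records, toy_bands).items():
        print(band, round(value, 2))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch; a real analysis would more likely use an established statistics library and report confidence intervals for each stratum.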

Abstract

Risk prediction has been a cornerstone of efforts to prevent clinical deterioration in hospitalized patients. However, despite guideline recommendations for use of early warning scores (EWSs) as part of a rapid response system, substantial questions remain about which scores to use and how best to implement them to reduce in-hospital cardiac arrest and mortality.¹

The study by Covino and colleagues² adds to those questions by demonstrating that patient age impacts not only the discrimination and calibration of different models but also the weighting of physiologic inputs to the model. In their study comparing EWSs in emergency department patients aged 80 years and older, the authors demonstrated that while the area under the receiver operating characteristic curve for predicting intensive care unit admission or death decreased slightly with increasing age for most scores, the Rapid Emergency Medicine Score (REMS), 1 of only 2 of the tested models that included age, discriminated better in patients older than 94 years and was also the best calibrated of the scores. In addition, they included a variable importance analysis demonstrating several interactions between age and other model inputs, with supplemental oxygen and systolic blood pressure, for example, having more of an impact on risk in patients older than 86 years.

This study² serves as an important reminder that age is both a key covariate and a potential confounder in clinical deterioration prediction, raising the question of why its inclusion is not more ubiquitous in EWSs. Age is a well-known predictor of mortality, which the developers of the National Early Warning Score (NEWS) were keenly aware of and had previously published on,³ suggesting that the decision to omit age from NEWS was a methodologic one rather than evidence based. Furthermore, these findings echo broader work comparing machine learning-based EWSs with traditional scores, where models that implicitly account for interactions often achieve higher discrimination.⁴

Traditional EWSs, such as the Modified Early Warning Score and NEWS, were intentionally simple, relying on fixed thresholds, additive points, and transparent interpretation, features that were essential in the era of paper charting and manual calculation. In contrast, modern artificial intelligence and machine learning models add complexity in calculation and interpretability but can be trained to account for interactions between predictors, reflecting the clinical reality that physiology rarely behaves linearly or independently.

More importantly, prior work suggests that the success or failure of early warning systems may depend less on model performance than on how predictions are operationalized. Studies of rapid response systems have demonstrated that even well-validated scores may fail to reduce adverse outcomes when alerts are poorly timed, poorly targeted, or lack clear response pathways.⁵

Interpreted literally, the findings from Covino et al² could be used to argue that hospitals should use NEWS in younger emergency department patients and REMS in older ones. That approach would retain model simplicity while accounting for age-specific interactions. However, it would also introduce substantial complexity for frontline clinicians, who would need to be familiar with multiple scores and workflows. In this context, adding multiple age-stratified scores may paradoxically worsen performance by increasing cognitive burden and reducing adherence, even if each individual model is statistically optimized for its subgroup.
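
The contrast drawn above between fixed-threshold, additive scores and models that capture interactions with age can be made concrete with a small sketch. Everything numeric here is an assumption for illustration: the thresholds and point values are invented and do not reproduce NEWS, REMS, or any validated score. Only the age cutoff of 86 years echoes the commentary's description of where supplemental oxygen and systolic blood pressure carried more weight. The names `Vitals`, `additive_points`, and `age_interaction_points` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vitals:
    age: int            # years
    systolic_bp: int    # mmHg
    on_oxygen: bool     # receiving supplemental oxygen


def additive_points(v: Vitals) -> int:
    """Traditional style: fixed thresholds, points summed, age ignored.
    Thresholds and weights are made up for illustration."""
    points = 0
    if v.systolic_bp < 100:
        points += 2
    if v.on_oxygen:
        points += 2
    return points


def age_interaction_points(v: Vitals) -> int:
    """Same inputs, but low blood pressure and supplemental oxygen carry
    extra weight in patients older than 86 years. The extra weights are
    hypothetical; only the direction of the effect follows the text."""
    points = additive_points(v)
    if v.age > 86:
        if v.systolic_bp < 100:
            points += 1
        if v.on_oxygen:
            points += 1
    return points


# Two hypothetical patients with identical vital signs but different ages.
younger = Vitals(age=70, systolic_bp=95, on_oxygen=True)
older = Vitals(age=90, systolic_bp=95, on_oxygen=True)

if __name__ == "__main__":
    for label, patient in (("younger", younger), ("older", older)):
        print(label, additive_points(patient), age_interaction_points(patient))
```

The additive version gives both patients the same total, while the interaction version separates them. The second function is still a single score, which illustrates one way to account for age without asking clinicians to switch between separate age-stratified tools.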


Dana P. Edelson studied this question.

synapsesocial.com/papers/69bf8692f665edcd009e8f3b https://doi.org/10.1001/jamanetworkopen.2026.1497
