What is the clinical evidence from this study?

Study design: Retrospective cohort. Population: Adult ward inpatients assessed for clinical deterioration (n=146,446 hospitalizations). Intervention: Deep recurrent neural network deterioration model vs. logistic regression model, NEWS, MEWS, and qSOFA. Primary outcome: 24-hour composite outcome of transfer to ICU or death (deep recurrent neural network AUPRC 0.042, 95% CI 0.04-0.043).

March 12, 2021 | Open Access

A Simulated Prospective Evaluation of a Deep Learning Model for Real-Time Prediction of Clinical Deterioration Among Ward Patients

Key Result

A deep recurrent neural network model for predicting transfer to ICU or death showed very poor performance (AUPRC 0.042; 95% CI 0.04-0.043). This was comparable to logistic regression (AUPRC 0.043) and only slightly better than NEWS, MEWS, and qSOFA (AUPRC 0.034, 0.028, and 0.021, respectively).

Study Design

Type

Cohort (n=146,446)

Multicenter

Yes

Structured PICO

Does a deep recurrent neural network model improve the real-time prediction of clinical deterioration (transfer to ICU or death) in inpatient adults compared to standard early warning scores?

Population

146,446 hospitalizations of inpatient adults across four hospitals in Pennsylvania, evaluated for clinical deterioration.

Exposure

Deep recurrent neural network deterioration model and logistic regression model, each using electronic health record data to generate hourly predictions of the 24-hour composite outcome of transfer to ICU or death.

Comparator

National Early Warning Score (NEWS), Modified Early Warning Score (MEWS), and quick Sepsis-related Organ Failure Assessment (qSOFA).

Outcome

24-hour composite outcome of transfer to ICU or death.

Commonly used early warning scores and deep learning models show very poor performance for real-time prediction of clinical deterioration in ward patients when assessed using simulated prospective validation.

Main Result

Effect estimate: AUPRC 0.042 (95% CI 0.04-0.043) for the deep recurrent neural network on the hold-out dataset
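To put the effect estimate in context, it can help to compare each reported AUPRC with the hourly event rate, since a model that ranks patient-hours at random has an expected AUPRC roughly equal to the prevalence of positive patient-hours. The following is a minimal sketch of that arithmetic using only figures quoted in the abstract. It assumes the hold-out prevalence matches the full-cohort event rate, which the abstract does not state, and the variable names are illustrative.

```python
# Figures quoted in the abstract. "16.75 million" is a rounded value,
# so every result below is approximate.
PATIENT_HOURS = 16_750_000
POSITIVE_HOURS = 260_295  # patient-hours within the 24-hour predictive horizon

# Reported AUPRC point estimates on the hold-out dataset.
REPORTED_AUPRC = {
    "Deep RNN": 0.042,
    "Logistic regression": 0.043,
    "NEWS": 0.034,
    "MEWS": 0.028,
    "qSOFA": 0.021,
}

# Assumption: hold-out prevalence is close to the full-cohort event rate.
baseline = POSITIVE_HOURS / PATIENT_HOURS

# Ratio of each AUPRC to the random-ranking baseline.
lift_over_baseline = {
    name: value / baseline for name, value in REPORTED_AUPRC.items()
}

print(f"Prevalence baseline: {baseline:.4f}")
for name, lift in lift_over_baseline.items():
    print(f"{name}: AUPRC {REPORTED_AUPRC[name]:.3f}, "
          f"about {lift:.1f}x the baseline")
```

Under these assumptions, every model sits within a small multiple of the random-ranking baseline, which is consistent with the authors' description of the performance as very poor.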

Abstract

OBJECTIVES: The National Early Warning Score, Modified Early Warning Score, and quick Sepsis-related Organ Failure Assessment can predict clinical deterioration. These scores exhibit only moderate performance and are often evaluated using aggregated measures over time. A simulated prospective validation strategy that assesses multiple predictions per patient-day would provide the best pragmatic evaluation. We developed a deep recurrent neural network deterioration model and conducted a simulated prospective evaluation.

DESIGN: Retrospective cohort study.

SETTING: Four hospitals in Pennsylvania.

PATIENTS: Inpatient adults discharged between July 1, 2017, and June 30, 2019.

INTERVENTIONS: None.

MEASUREMENTS AND MAIN RESULTS: We trained a deep recurrent neural network and a logistic regression model using data from electronic health records to predict, every hour, the 24-hour composite outcome of transfer to ICU or death. We analyzed 146,446 hospitalizations with 16.75 million patient-hours. The hourly event rate was 1.6% (12,842 transfers or deaths, corresponding to 260,295 patient-hours within the predictive horizon). On a hold-out dataset, the deep recurrent neural network achieved an area under the precision-recall curve of 0.042 (95% CI, 0.04-0.043), comparable with the logistic regression model (0.043; 95% CI, 0.041-0.045), and outperformed National Early Warning Score (0.034; 95% CI, 0.032-0.035), Modified Early Warning Score (0.028; 95% CI, 0.027-0.03), and quick Sepsis-related Organ Failure Assessment (0.021; 95% CI, 0.021-0.022). For a fixed sensitivity of 50%, the deep recurrent neural network achieved a positive predictive value of 3.4% (95% CI, 3.4-3.5) and outperformed the logistic regression model (3.1%; 95% CI, 3.1-3.2), National Early Warning Score (2.0%; 95% CI, 2.0-2.0), Modified Early Warning Score (1.5%; 95% CI, 1.5-1.5), and quick Sepsis-related Organ Failure Assessment (1.5%; 95% CI, 1.5-1.5).

CONCLUSIONS: Commonly used early warning scores for clinical decompensation, along with a logistic regression model and a deep recurrent neural network model, show very poor performance characteristics when assessed using a simulated prospective validation. None of these models may be suitable for real-time deployment.
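The abstract reports positive predictive value at a fixed sensitivity of 50%. The sketch below shows one way such a figure could be computed from hourly risk scores, and what a PPV of 3.4% implies for alert burden. It is an illustration only, not the authors' evaluation code: the function names and the toy scores and labels are hypothetical, and the threshold rule (the highest score cutoff whose sensitivity reaches the target) is an assumption.

```python
def ppv_at_sensitivity(scores, labels, target_sensitivity=0.5):
    """Return (ppv, threshold) at the highest score threshold whose
    sensitivity is at least target_sensitivity.

    scores: one risk score per patient-hour (higher means more risk).
    labels: 1 if the composite outcome occurs within the horizon, else 0.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("labels contain no positive patient-hours")

    # Rank patient-hours from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every patient-hour tied at this score before evaluating.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        if true_pos / total_positives >= target_sensitivity:
            return true_pos / (true_pos + false_pos), threshold
    raise ValueError("target_sensitivity must be at most 1.0")


def false_alerts_per_true_alert(ppv):
    """Number of false alerts raised for each true alert at a given PPV."""
    return (1.0 - ppv) / ppv


# Toy data (hypothetical, not from the study).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_labels = [1, 0, 0, 1, 0, 0, 1, 0]
toy_ppv, toy_threshold = ppv_at_sensitivity(toy_scores, toy_labels, 0.5)

# Alert burden implied by the reported PPV of 3.4% for the deep RNN.
rnn_burden = false_alerts_per_true_alert(0.034)
print(f"Toy PPV {toy_ppv:.2f} at threshold {toy_threshold}")
print(f"About {rnn_burden:.0f} false alerts per true alert at PPV 3.4%")
```

Read this way, the reported 3.4% PPV corresponds to roughly 28 false alerts for every true alert while still missing half of the positive patient-hours, which illustrates why the authors question suitability for real-time deployment.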
