October 5, 2016

Comparison of Approaches for Heart Failure Case Identification From Electronic Health Record Data

Q: What is the clinical evidence from this study?

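Study design: Observational, retrospective, single center. Population: Adult hospitalizations at an academic medical center (n=47,119), of which 6549 (13.9%) had a discharge diagnosis of heart failure. Intervention: Machine learning algorithms using structured and unstructured EHR data vs. problem list diagnosis. Primary outcome: Area under the receiver operating characteristic curve (AUC) for heart failure identification (Algorithm 5).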

Key Result

A machine learning approach using both structured and unstructured electronic health record data accurately identified hospitalized patients with heart failure, achieving an AUC of 0.974.
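
For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen heart failure hospitalization receives a higher score than a randomly chosen hospitalization without heart failure. The sketch below computes it by direct pairwise comparison. The function name and the toy scores are illustrative assumptions, not data or code from the study.

```python
from typing import Sequence


def pairwise_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is O(n^2) and meant for
    illustration only, not for a cohort of tens of thousands of rows.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example (made-up scores, not study data).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
toy_labels = [True, True, False, True, False]
toy_auc = pairwise_auc(toy_scores, toy_labels)
```

In the toy example, 5 of the 6 positive/negative pairs are ordered correctly, so the result is 5/6.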

Study Design

Type

Observational (n=47,119)

Single center

PICO

Population

Adult hospitalizations at an academic medical center (n=47,119); 6549 (13.9%) had a discharge diagnosis of heart failure

Intervention / Comparator

Machine learning algorithms using structured and unstructured EHR data vs Problem list diagnosis

Primary Outcome

Area under the receiver operating characteristic curve (AUC) for heart failure identification (Algorithm 5)

Limitations

Discharge diagnosis codes are subject to misclassification and upcoding
Imperfect reliability among physicians for chart review and potential sampling bias
Single institution study, limiting generalizability
May have missed potential contraindications to quality metrics
Algorithms were not validated on outpatients

Abstract

IMPORTANCE: Accurate, real-time case identification is needed to target interventions to improve quality and outcomes for hospitalized patients with heart failure. Problem lists may be useful for case identification but are often inaccurate or incomplete. Machine-learning approaches may improve accuracy of identification but can be limited by complexity of implementation.

OBJECTIVE: To develop algorithms that use readily available clinical data to identify patients with heart failure while in the hospital.

DESIGN, SETTING, AND PARTICIPANTS: We performed a retrospective study of hospitalizations at an academic medical center. Hospitalizations for patients 18 years or older who were admitted after January 1, 2013, and discharged before February 28, 2015, were included. From a random 75% sample of hospitalizations, we developed 5 algorithms for heart failure identification using electronic health record data: (1) heart failure on problem list; (2) presence of at least 1 of 3 characteristics: heart failure on problem list, inpatient loop diuretic, or brain natriuretic peptide level of 500 pg/mL or higher; (3) logistic regression of 30 clinically relevant structured data elements; (4) machine-learning approach using unstructured notes; and (5) machine-learning approach using structured and unstructured data.

MAIN OUTCOMES AND MEASURES: Heart failure diagnosis based on discharge diagnosis and physician review of sampled medical records.

RESULTS: A total of 47 119 hospitalizations were included in this study (mean [SD] age, 60.9 [18.15] years; 23 952 female [50.8%], 5258 black/African American [11.2%], and 3667 Hispanic/Latino [7.8%] patients). Of these hospitalizations, 6549 (13.9%) had a discharge diagnosis of heart failure. Inclusion of heart failure on the problem list (algorithm 1) had a sensitivity of 0.40 and a positive predictive value (PPV) of 0.96 for heart failure identification. Algorithm 2 improved sensitivity to 0.77 at the expense of a PPV of 0.64. Algorithms 3, 4, and 5 had areas under the receiver operating characteristic curves of 0.953, 0.969, and 0.974, respectively. With a PPV of 0.9, these algorithms had associated sensitivities of 0.68, 0.77, and 0.83, respectively.

CONCLUSIONS AND RELEVANCE: The problem list is insufficient for real-time identification of hospitalized patients with heart failure. The high predictive accuracy of machine learning using free text demonstrates that support of such analytics in future electronic health record systems can improve cohort identification.
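
To make the rule-based approach and the reported metrics concrete, here is a minimal sketch of algorithm 2 as described in the abstract (at least 1 of 3 characteristics), together with sensitivity, PPV, and the "sensitivity at a PPV of 0.9" calculation used to compare the score-based algorithms. The record layout, field names, function names, and toy data are assumptions made for illustration; the study's actual implementation and data are not given on this page.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Hospitalization:
    # Hypothetical record layout; field names are illustrative only.
    hf_on_problem_list: bool
    inpatient_loop_diuretic: bool
    bnp_pg_ml: Optional[float]  # None when no BNP level was measured


def algorithm_2(h: Hospitalization) -> bool:
    """Flag a hospitalization when at least 1 of the 3 characteristics is present."""
    high_bnp = h.bnp_pg_ml is not None and h.bnp_pg_ml >= 500
    return h.hf_on_problem_list or h.inpatient_loop_diuretic or high_bnp


def sensitivity_and_ppv(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); PPV = TP / (TP + FP)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return sensitivity, ppv


def sensitivity_at_ppv(
    scores: Sequence[float], actual: Sequence[bool], target_ppv: float = 0.9
) -> float:
    """Highest sensitivity over all score thresholds whose PPV meets the target."""
    best = 0.0
    for threshold in sorted(set(scores)):
        predicted = [s >= threshold for s in scores]
        sens, ppv = sensitivity_and_ppv(predicted, actual)
        if ppv >= target_ppv:  # a NaN PPV never satisfies this comparison
            best = max(best, sens)
    return best


# Toy cohort (made-up values, not study data).
cohort = [
    Hospitalization(True, False, None),
    Hospitalization(False, True, 120.0),
    Hospitalization(False, False, 800.0),
    Hospitalization(False, False, None),
    Hospitalization(False, False, 90.0),
]
truth = [True, False, True, False, True]
flags = [algorithm_2(h) for h in cohort]
rule_sens, rule_ppv = sensitivity_and_ppv(flags, truth)

# Toy model scores for a score-based algorithm (again made up).
model_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
model_truth = [True, True, False, True, False]
sens_at_90 = sensitivity_at_ppv(model_scores, model_truth, target_ppv=0.9)
```

The rule trades PPV for sensitivity because any one of the three signals is enough to flag a case, which mirrors the direction of the trade-off reported between algorithms 1 and 2. Fixing PPV at a common value, as in the last function, is one way to compare score-based algorithms on sensitivity alone.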

Cite this study

Blecker et al. (2016) conducted an observational study of 47,119 adult hospitalizations at an academic medical center. Machine learning algorithms using structured and unstructured EHR data were compared with problem list diagnosis on area under the receiver operating characteristic curve (AUC) for heart failure identification. A machine learning approach using both structured and unstructured electronic health record data (Algorithm 5) accurately identified hospitalized patients with heart failure, achieving an AUC of 0.974.

synapsesocial.com/papers/6a156307814bf8ec9a4e827f — DOI: https://doi.org/10.1001/jamacardio.2016.3236

Authors

Saul Blecker

Heart Failure / Cardiomyopathy

Stuart D. Katz

Heart Failure & Transplant

Leora I. Horwitz

Heart Failure & Transplant

Journals

JAMA Cardiology

Institutions

New York University

NewYork–Presbyterian Hospital

New York Hospital Queens
