What question did this study set out to answer?

This research aims to develop a multimodal critical care dataset to enhance precision in patient care.

March 26, 2026

1713: The Initial 50,000 ICU Admissions Within the CHoRUS Bridge2AI Multimodal Dataset

Key Points

This research aims to develop a multi-center, multimodal critical care dataset to support precision critical care.
The cohort is a random sample of adult, pediatric, and neonatal ICU admissions.
Data were partitioned into 60% training, 20% test, and 20% hold-out sets to enhance robustness and facilitate regulatory compliance.
EHR data were mapped to the OMOP Common Data Model, extended with custom ICU concepts and linkages to imaging and waveforms.
OHDSI tools (ATLAS) enabled comparison of variations in ICU practice patterns.
As of 7/1/25, the dataset had accrued 50,637 ICU admissions and 1.6 billion rows of OMOP EHR data.
The dataset also held 23 terabytes of waveform data and initial radiology data for 7,642 patients.
Reported diagnoses included acute kidney injury in 9,491 patients and sepsis in 8,880.
Death was recorded in 21.4% of admissions. Separately, user onboarding and 3 datathons yielded iterative data quality improvement cycles.
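
The 60/20/20 partition named above can be illustrated with a short sketch. The abstract does not say how records were assigned to partitions, so everything here is an assumption made for illustration: the hash-based assignment, the choice to key on a patient identifier, and the names `assign_partition`, `partition_counts`, and `salt` are all hypothetical.

```python
import hashlib


def assign_partition(patient_id: str, salt: str = "demo-salt") -> str:
    """Map an identifier to 'train', 'test', or 'holdout' (target 60/20/20).

    Hypothetical scheme: hashing makes the assignment stable across runs,
    so the same identifier always lands in the same partition.
    """
    digest = hashlib.sha256(f"{salt}:{patient_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # integer in 0..99
    if bucket < 60:
        return "train"
    if bucket < 80:
        return "test"
    return "holdout"


def partition_counts(ids):
    """Count how many identifiers fall into each partition."""
    counts = {"train": 0, "test": 0, "holdout": 0}
    for pid in ids:
        counts[assign_partition(pid)] += 1
    return counts


if __name__ == "__main__":
    # Synthetic identifiers only; no real patient data.
    demo_ids = [f"patient-{i}" for i in range(10_000)]
    print(partition_counts(demo_ids))
```

Keying the assignment on a patient rather than an admission would keep repeat admissions of one person inside a single partition, which is one common way to limit leakage between training and evaluation data. Whether CHoRUS does this is not stated in the abstract.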

Abstract

Introduction: Precision critical care requires large and representative data. We developed a multi-center, multimodal, high-resolution critical care dataset including electronic health record (EHR), imaging, and waveform data. We report the composition of the initial 50,000 patient admissions and lessons learned.

Methods: We included a random sample of adult, pediatric, and neonatal intensive care unit (ICU) admissions. Data partitions specified 60% training, 20% test, and 20% hold-out sets to enhance robustness and facilitate regulatory compliance (e.g., FDA filings). The CHoRUS Trusted Research Environment (TRE) and AI/ML workspace were developed in the cloud to promote collaborative computing and privacy. EHR data included thousands of laboratory, medication, and nursing flowsheet variables, along with diagnoses and notes, mapped to the Observational Medical Outcomes Partnership (OMOP) Common Data Model, which was extended for custom ICU concepts and linkages to imaging (DICOM) and waveforms (WFDB). OHDSI tools (ATLAS) enabled comparison of variations in ICU practice patterns.

Results: As of 7/1/25, the CHoRUS dataset had accrued 50,637 ICU admissions: 1.6 billion rows of OMOP EHR data, 23 TB of waveforms, and initial radiology data for 7,642 patients. ICD-10 diagnoses included acute kidney injury (9,491 patients), sepsis (8,880), shock (5,221), trauma (3,068), ARDS (951), acute MI (2,264), pulmonary embolism (1,677), subarachnoid hemorrhage (1,637), subdural hemorrhage (2,661), and intracranial hemorrhage (1,053). No site exceeded 18% of the cohort. Race and ethnicity varied by hospital: 35% Black and 30% Hispanic. Age distribution was bimodal, peaking for neonates and ages 60-70 years. Death was recorded in 21.4%. The CHoRUS TRE had onboarded 226 active user accounts, including an NIH AI training program (AIM-AHEAD) and 3 datathons, yielding iterative data quality improvement cycles. The TRE hosts AI/ML tools for data labeling and model development/validation.

Conclusions: The Bridge2AI CHoRUS critical care dataset offers multimodal, multispecialty, high-resolution data in a trusted research environment to support generalizable clinical care AI and comparative effectiveness research. Details are available at www.github.com/chorus-ai.
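
As a rough illustration of how per-diagnosis patient counts like those in the Results could be derived from OMOP-mapped data, the sketch below counts distinct patients by ICD-10 code prefix in a toy in-memory table. It assumes a heavily simplified `condition_occurrence` table holding only `person_id` and `condition_source_value`; the rows are synthetic, the function names are hypothetical, and the prefixes (A41 for sepsis, N17 for acute kidney failure, I21 for acute MI) are illustrative only, since the abstract does not list the codes behind each diagnosis group.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory table loosely modeled on OMOP condition_occurrence."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE condition_occurrence ("
        "condition_occurrence_id INTEGER PRIMARY KEY, "
        "person_id INTEGER NOT NULL, "
        "condition_source_value TEXT NOT NULL)"
    )
    # Synthetic rows: (row id, person id, ICD-10 source code).
    rows = [
        (1, 101, "A41.9"),
        (2, 101, "N17.9"),
        (3, 102, "A41.9"),
        (4, 102, "A41.51"),
        (5, 103, "I21.4"),
    ]
    conn.executemany("INSERT INTO condition_occurrence VALUES (?, ?, ?)", rows)
    return conn


def count_patients_by_icd10_prefix(conn, prefixes):
    """Return {prefix: number of distinct patients with a matching code}."""
    result = {}
    for prefix in prefixes:
        row = conn.execute(
            "SELECT COUNT(DISTINCT person_id) FROM condition_occurrence "
            "WHERE condition_source_value LIKE ?",
            (prefix + "%",),
        ).fetchone()
        result[prefix] = row[0]
    return result


if __name__ == "__main__":
    conn = build_toy_db()
    print(count_patients_by_icd10_prefix(conn, ["A41", "N17", "I21"]))
```

Counting distinct `person_id` values rather than rows matters here: patient 102 carries two sepsis codes in the toy data but is counted once, matching the per-patient framing of the reported diagnosis counts.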
