Abstract

Rationale: The clinical identification of hypercapnic respiratory failure (HRF) is challenging because it presents in multiple ways and often coexists with chronic or systemic illness. The literature lacks quantitative data describing how patients with hypercapnia present to the hospital. Characterizing presenting symptoms using natural language processing (NLP) could clarify the clinical spectrum of HRF and improve both diagnostic recognition and epidemiologic characterization.

Methods: This retrospective study used the Medical Information Mart for Intensive Care (MIMIC-IV) database, which contains hospitalizations and emergency room (ER) visits at Beth Israel Deaconess Medical Center from 2011 to 2019. Encounters were included if they had a documented ER chief complaint and at least one of the following: a diagnosis code for hypercapnic respiratory failure (ICD-10 J96.x2) or obesity hypoventilation syndrome (E66.2), an arterial blood gas (ABG) PaCO2 ≥ 45 mmHg, or a venous blood gas (VBG) PCO2 ≥ 50 mmHg. Chief complaints were normalized (abbreviation expansion, contextual disambiguation, conservative spelling correction, and lemmatization), and embeddings were generated using a transformer model (bioclinical-modernBERT). Chief complaints were mapped to one or more of the 18 National Hospital Ambulatory Medical Care Survey (NHAMCS) “Reason for Visit” categories using nearest-neighbor mapping and rule-based overrides for key terms (e.g., dyspnea, altered mental status). In 160 adjudicated cases, agreement with two reviewers was moderate (mean F1 = 0.67, κ = 0.50; 54% exact, 21% partial matches).

Results: We identified 27,459 encounters (median age 65 years, 52.9% male). Ascertainment criteria (non-exclusive) were ICD codes in 7.2% (n = 1,983), ABG in 61.2% (n = 16,806), and VBG in 63.5% (n = 17,422) of the cohort. By NHAMCS category, respiratory system (21.8%), nervous system (13.8%), and digestive system (12.4%) complaints were most common.
In contrast, among patients under 40 years old, “injuries and adverse effects” was the most common presenting category. Respiratory complaints predominated among ICD-coded cases (51%) but were less frequent in the ABG-defined (22.7%) and VBG-defined (26.5%) cohorts. Figure 1 shows category distributions by ascertainment route.

Conclusion: NLP characterization of presenting symptoms reveals that hypercapnic respiratory failure encompasses a broad clinical spectrum. While respiratory complaints predominate, particularly among ICD-defined cases, many patients meeting blood-gas criteria present for reasons other than dyspnea or confusion. These findings suggest that hypercapnia may frequently occur as an epiphenomenon or nonspecific marker of disease severity, a distinction that is crucial for estimating attributable risk and targeting interventions.

This abstract is funded by: None
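The inclusion rule described in the Methods can be sketched as follows. This is a minimal illustration, not the study's code: the `Encounter` class, its field names, and the helper functions are hypothetical, and the sketch assumes ICD codes may arrive with or without the decimal point. Only the code patterns and blood-gas thresholds come from the abstract.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Set

# J96.x2 with the dot removed: J96, any digit, then a final 2.
J96_HYPERCAPNIC = re.compile(r"^J96\d2$")
OHS_CODE = "E662"          # E66.2 with the dot removed
ABG_PACO2_THRESHOLD = 45.0  # mmHg, from the abstract
VBG_PCO2_THRESHOLD = 50.0   # mmHg, from the abstract


@dataclass
class Encounter:
    """Hypothetical container for one encounter (names are assumptions)."""
    icd_codes: List[str] = field(default_factory=list)
    abg_paco2: List[float] = field(default_factory=list)  # mmHg
    vbg_pco2: List[float] = field(default_factory=list)   # mmHg
    chief_complaint: Optional[str] = None


def ascertainment_routes(enc: Encounter) -> Set[str]:
    """Return the non-exclusive routes by which an encounter qualifies."""
    routes: Set[str] = set()
    codes = [c.replace(".", "").upper() for c in enc.icd_codes]
    if any(J96_HYPERCAPNIC.match(c) or c == OHS_CODE for c in codes):
        routes.add("icd")
    if any(v >= ABG_PACO2_THRESHOLD for v in enc.abg_paco2):
        routes.add("abg")
    if any(v >= VBG_PCO2_THRESHOLD for v in enc.vbg_pco2):
        routes.add("vbg")
    return routes


def is_included(enc: Encounter) -> bool:
    """Require a non-empty chief complaint and at least one route."""
    has_complaint = bool(enc.chief_complaint and enc.chief_complaint.strip())
    return has_complaint and bool(ascertainment_routes(enc))
```

Because the routes are returned as a set, one encounter can count toward the ICD, ABG, and VBG groups at the same time, which is why the reported percentages sum to more than 100%.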
Merdad et al. (Fri,) studied this question.
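The category-mapping step (nearest-neighbor assignment with rule-based overrides) might look roughly like the sketch below. It makes several assumptions that the abstract does not confirm: embeddings are passed in as plain lists of floats rather than computed by bioclinical-modernBERT, each category is represented by a single prototype vector, similarity is cosine, and a matching override replaces the nearest-neighbor result instead of being merged with it. The function names and the override table are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def map_complaint(
    text: str,
    embedding: Sequence[float],
    prototypes: Dict[str, Sequence[float]],
    overrides: Dict[str, str],
    top_k: int = 1,
) -> List[str]:
    """Map one normalized chief complaint to one or more categories.

    Assumed behavior: if any override term occurs in the text, return the
    override categories; otherwise return the top_k categories whose
    prototype vectors are most similar to the complaint embedding.
    """
    lowered = text.lower()
    forced = sorted({cat for term, cat in overrides.items() if term in lowered})
    if forced:
        return forced
    ranked = sorted(
        prototypes,
        key=lambda cat: cosine(embedding, prototypes[cat]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy two-dimensional vectors, for illustration only.
prototypes = {"respiratory system": [1.0, 0.0], "digestive system": [0.0, 1.0]}
overrides = {"dyspnea": "respiratory system"}
print(map_complaint("abdominal pain", [0.1, 0.9], prototypes, overrides))
print(map_complaint("dyspnea on exertion", [0.1, 0.9], prototypes, overrides))
```

In the second call the embedding sits closer to the digestive prototype, but the key-term rule forces the respiratory category, which is the purpose of an override for terms such as dyspnea.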