March 3, 2026

Evaluating AI Tools for COPD Management: NLP-Based Identification of Exacerbations in Local EHRs

Key Points

Results demonstrate that classical machine learning models effectively identify COPD exacerbations in electronic health records.
LightGBM and CatBoost models achieve the best balance of discrimination and calibration, with an expected calibration error (ECE) below 1.3%.
Analysis utilized unstructured clinical texts from 5,924 outpatient notes labeled by a pulmonologist.
This study supports the viability of domain-adapted models for decision support in resource-limited settings.

Abstract

In this study we evaluate the performance of classical machine learning models and transformer-based models (BioClinicalBERT and large language models, LLMs) for the automatic identification of COPD exacerbations using unstructured clinical texts from Colombian electronic health records (EHRs). The dataset included 5,924 outpatient notes written in Spanish and manually labeled by a pulmonologist, incorporating two key fields: “subjective” (patient-reported symptoms) and “analysis” (physician assessment). This work addresses a critical gap, as no prior models have been developed or validated for exacerbation detection in Spanish-language clinical texts. Results show that classical models, especially LightGBM and CatBoost, achieved the best balance between discrimination and calibration (ECE < 1.3%), outperforming BioClinicalBERT and LLMs. These findings support the use of supervised, domain-adapted models for decision support in resource-limited clinical settings.
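
For readers unfamiliar with the calibration metric quoted above, the following is a minimal sketch of how an expected calibration error (ECE) is commonly computed for a binary classifier. It assumes the standard equal-width binning definition with 10 bins; the abstract does not state which binning scheme or bin count the study used, so the function name, the `n_bins` default, and the toy inputs are illustrative assumptions, not the authors' implementation or data.

```python
def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Binned ECE for binary labels (0/1) and predicted probabilities in [0, 1].

    Assumption: equal-width bins; the study's exact scheme is not given.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    n = len(y_true)
    if n == 0:
        raise ValueError("at least one sample is required")

    # Assign each sample to an equal-width probability bin.
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        idx = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes in the last bin
        bins[idx].append((label, prob))

    # Weighted average of |observed event rate - mean predicted probability|.
    ece = 0.0
    for members in bins:
        if not members:
            continue
        observed = sum(label for label, _ in members) / len(members)
        predicted = sum(prob for _, prob in members) / len(members)
        ece += (len(members) / n) * abs(observed - predicted)
    return ece


# Toy example (made-up values, not study data).
toy_labels = [0, 0, 1, 1]
toy_probs = [0.1, 0.1, 0.9, 0.9]
toy_ece = expected_calibration_error(toy_labels, toy_probs)
```

Under this definition, an ECE below 1.3% would mean that, averaged over bins and weighted by bin size, predicted probabilities differ from observed exacerbation rates by less than 0.013.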
