Background: The widespread adoption of digital systems in healthcare has produced large volumes of unstructured text data, including outpatient messages sent through electronic medical record (EMR) portals. Efficient classification of these messages is essential for improving workflow automation and enabling timely clinical responses. Methods: This study investigates the use of large language models (LLMs) for classifying real-world outpatient messages collected from a healthcare system in central Illinois. We compare a general-purpose model (GPT-4o) with domain-specific models (BioBERT and ClinicalBERT), evaluating both fine-tuned and few-shot configurations against a TF-IDF + Logistic Regression baseline. Experiments were performed in a HIPAA-compliant environment using de-identified, physician-labeled data. Results and Conclusions: Fine-tuned GPT-4o achieved 97.5% accuracy in urgency detection and 97.8% in full message classification, outperforming BioBERT and ClinicalBERT. These results demonstrate the feasibility and validity of applying modern LLMs to outpatient communication triage while ensuring both interpretability and privacy compliance.
Shifa et al. studied this question.
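To make the TF-IDF + Logistic Regression baseline named in the Methods more concrete, the following is a minimal sketch of that kind of pipeline for binary urgency detection, written with only the Python standard library. It is not the authors' implementation: the tokenizer, the smoothed IDF formula, the learning rate, the epoch count, and every function name here are assumptions made for illustration, and the six training messages are invented toy examples, not data from the study. A real baseline would more likely use an established library and the physician-labeled corpus.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the study's preprocessing is not specified.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency: log((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}

def tfidf(doc, idf):
    # Sparse, L2-normalized TF-IDF vector; out-of-vocabulary tokens are dropped.
    tf = Counter(tok for tok in tokenize(doc) if tok in idf)
    vec = {tok: count * idf[tok] for tok, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {tok: v / norm for tok, v in vec.items()}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(vectors, labels, epochs=200, lr=0.5):
    # Plain stochastic gradient descent on the logistic loss (no regularization).
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            z = bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())
            err = sigmoid(z) - y
            bias -= lr * err
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * err * v
    return weights, bias

def predict_proba(text, idf, weights, bias):
    # Estimated probability that a message is urgent (label 1).
    vec = tfidf(text, idf)
    return sigmoid(bias + sum(weights.get(t, 0.0) * v for t, v in vec.items()))

# Invented toy messages (1 = urgent, 0 = non-urgent); NOT data from the study.
TRAIN = [
    ("severe chest pain and trouble breathing", 1),
    ("high fever and vomiting since last night", 1),
    ("sudden swelling and severe pain after surgery", 1),
    ("please refill my blood pressure prescription", 0),
    ("need to reschedule my appointment next week", 0),
    ("question about my billing statement", 0),
]

texts = [text for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(texts)
vectors = [tfidf(text, idf) for text in texts]
weights, bias = train_logreg(vectors, labels)

if __name__ == "__main__":
    for message in ["severe pain and fever", "reschedule my prescription refill"]:
        p = predict_proba(message, idf, weights, bias)
        print(f"{message!r}: urgent" if p > 0.5 else f"{message!r}: non-urgent")
```

The sketch covers only the urgency task; the full message classification described in the abstract would need a multiclass extension (for example one-vs-rest), which is left out here to keep the example short.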