Electronic Health Records (EHRs) provide a rich, longitudinal view of patient health and hold significant potential for advancing clinical decision support, risk prediction, and data-driven healthcare research. However, most artificial intelligence (AI) models for EHRs are designed for narrow, single-purpose tasks, limiting their generalizability and utility in real-world settings. Here, we present CEHR-XGPT, a general-purpose foundation model for EHR data that unifies three essential capabilities (feature representation, zero-shot prediction, and synthetic data generation) within a single architecture. To support temporal reasoning over clinical sequences, CEHR-XGPT incorporates a novel time-token-based learning framework that explicitly encodes patients' dynamic timelines into the model structure. CEHR-XGPT demonstrates strong performance across all three tasks and generalizes effectively to external datasets through vocabulary expansion and fine-tuning. Its versatility enables rapid model development, cohort discovery, and patient outcome forecasting without the need for task-specific retraining.
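As a rough illustration of what encoding a patient timeline with time tokens could look like, the sketch below interleaves an elapsed-time token between consecutive visits in a flat token sequence. It is not taken from the paper: the token formats (`D<days>`, `[VS]`, `[VE]`), the function names, and the placeholder concept codes are all assumptions made for this example, and the actual CEHR-XGPT tokenization may differ.

```python
from datetime import date

# Hypothetical markers for the start and end of a visit (assumed, not from the paper).
VISIT_START = "[VS]"
VISIT_END = "[VE]"


def interval_token(days: int) -> str:
    """Map a gap in days to a hypothetical time token such as 'D14'."""
    if days < 0:
        raise ValueError("visits must be in chronological order")
    return f"D{days}"


def build_sequence(visits):
    """Flatten visits into one token sequence with time tokens between visits.

    `visits` is a list of (visit_date, concept_codes) pairs sorted by date.
    """
    tokens = []
    previous_date = None
    for visit_date, concepts in visits:
        if previous_date is not None:
            # Encode the elapsed time since the previous visit as its own token.
            tokens.append(interval_token((visit_date - previous_date).days))
        tokens.append(VISIT_START)
        tokens.extend(concepts)
        tokens.append(VISIT_END)
        previous_date = visit_date
    return tokens


# Placeholder concept codes, purely illustrative.
visits = [
    (date(2020, 1, 1), ["condition_a", "drug_b"]),
    (date(2020, 1, 15), ["procedure_c"]),
]
sequence = build_sequence(visits)
```

Under these assumptions, the gap between visits becomes part of the sequence itself, so a sequence model sees elapsed time as ordinary tokens and needs no separate timestamp input.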
Chao Pang
Ghent University
Jiheum Park
Columbia University Irving Medical Center
Xinzhuo Jiang
Columbia University Irving Medical Center
Pang et al. studied this question.
synapsesocial.com/papers/68e02f40f0e39f13e7fa299e — DOI: https://doi.org/10.48550/arxiv.2509.03643