Abstract Diagnosis omission in discharge diagnosis lists is common in electronic medical records (EMRs). It leads to inaccurate documentation, incorrect Diagnosis Related Group (DRG) assignments, and reduced reimbursement when Complications and Comorbidities (CC) or Major Complications and Comorbidities (MCC) are overlooked. To address this, we propose a learning framework driven by cross-level fusion of data and knowledge for the automated identification of missed diagnoses. Evaluated on real-world EMRs from six hospitals across several provinces in China, our model achieves higher F1 scores than an expert-system method, a BERT-based method, and multiple LLM-based baselines. The model predicts missed diagnoses in 37.8% of EMRs; 9.0% see a change in DRG grouping, which in turn affects 3.2% of insurance reimbursement. To minimize alert fatigue, we adopt a hybrid approach that combines our model with an expert system, improving precision by 6.7–13.4%. We also design two human-machine coupling modes to demonstrate the practical utility of our method.
Liu et al. (Thu,) studied this question.