Accurate differentiation of common hematologic disorders remains challenging in routine clinical practice and often requires invasive diagnostic procedures. Although complete blood count (CBC) testing is widely available, its diagnostic value for early disease triage is not fully understood. In this retrospective study, 165,181 routine blood test records collected between October 2011 and June 2025 were screened; after exclusion of cases lacking definitive diagnostic information, 4,056 samples with confirmed diagnoses were included for model development and validation. Patients were classified into four groups: aplastic anemia (AA), immune thrombocytopenia (ITP), myelodysplastic syndrome (MDS), and other hematologic conditions. Machine learning models were developed using routinely available CBC parameters. Model performance was assessed using one-vs-rest receiver operating characteristic (ROC) curves, area under the curve (AUC), and class-specific precision, recall, and F1-scores. Model interpretability was evaluated using Shapley Additive exPlanations (SHAP). Baseline demographic and hematologic parameters differed significantly among diagnostic groups (all P < 0.001). Among the evaluated models, LightGBM demonstrated robust overall performance, achieving one-vs-rest AUCs of 0.920 for AA, 0.970 for ITP, 0.788 for MDS, and 0.870 for other conditions, with an overall accuracy of 0.82. While AA and ITP were identified with favorable precision and recall, MDS showed lower recall, reflecting substantial overlap in routine laboratory features. SHAP analysis identified platelet count, red blood cell count, and white blood cell count as the most influential predictors. A machine learning model based on routinely available CBC parameters can support non-invasive differentiation of common hematologic disorders. This approach may serve as a practical screening and triage tool at the outpatient or pre-bone-marrow stage, helping to optimize the use of invasive diagnostic procedures.
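The evaluation metrics named in the abstract (one-vs-rest AUC and class-specific precision, recall, and F1) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names `ovr_auc` and `class_metrics` and the toy labels and scores are hypothetical, chosen only to show how each class is scored against all others combined. In practice a library routine such as scikit-learn's `roc_auc_score` with `multi_class="ovr"` would likely be used instead.

```python
def ovr_auc(y_true, scores, positive):
    """One-vs-rest AUC for one class.

    Treats `positive` as the positive class and every other label as negative,
    then computes the probability that a random positive sample receives a
    higher score than a random negative one (ties count as 0.5).
    """
    pos = [s for y, s in zip(y_true, scores) if y == positive]
    neg = [s for y, s in zip(y_true, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 for a single class in a multiclass problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy data: true diagnoses, predicted diagnoses, and the
# model's predicted probability for the "AA" class on four samples.
y_true = ["AA", "ITP", "MDS", "AA"]
y_pred = ["AA", "ITP", "AA", "MDS"]
p_aa = [0.9, 0.2, 0.7, 0.6]

print("AA one-vs-rest AUC:", ovr_auc(y_true, p_aa, "AA"))
print("AA precision/recall/F1:", class_metrics(y_true, y_pred, "AA"))
```

Reporting the metrics per class, rather than a single pooled score, is what makes a pattern like the one described above visible: a class can have an acceptable AUC while still showing low recall at the chosen decision threshold.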
Wang et al. (Sun,) studied this question.