An XGBoost machine learning model using routine laboratory data identified multiple myeloma in chronic kidney disease patients with 95.1% accuracy and a ROC-AUC of 0.962.
Does a machine learning model using routine laboratory parameters accurately detect multiple myeloma in patients with chronic kidney disease?
A machine learning model using routine laboratory data can accurately identify multiple myeloma in patients with chronic kidney disease, potentially reducing diagnostic delays.
Abstract
Background: Multiple myeloma (MM) and chronic kidney disease (CKD) share similar clinical manifestations, including renal impairment and anemia, making MM diagnosis challenging in general and nephrology clinics. Many CKD patients harbor undiagnosed MM, but healthcare insurance restrictions and public health system constraints limit ordering of specialized tests (serum protein electrophoresis, free light chains, immunoglobulins). This diagnostic delay can result in irreversible renal damage and delayed treatment. We developed a machine learning model using routine laboratory parameters to identify CKD patients requiring MM-specific workup or hematology referral.
Methods: We analyzed 4,759 CKD patients (591 MM, 4,168 non-MM controls; 12.4% prevalence) from a tertiary center. Using only routine tests available in general practice, we engineered features from the CBC (hemoglobin, WBC, platelets), biochemistry (total protein, albumin, calcium, LDH, creatinine), and urine studies (protein, albumin, creatinine). Three feature sets were tested: RANK1 (20 features), RANK2 (16 features), and RANK3 (4 minimal features). We compared XGBoost, Random Forest, and Logistic Regression using stratified 5-fold cross-validation, with both unbalanced and SMOTEENN-resampled data.
Results: XGBoost achieved the best performance: accuracy 95.1%, F1-score 0.798, ROC-AUC 0.962, PR-AUC 0.889. The minimal 4-feature model (platelets, WBC, urine protein, LDH) maintained clinically useful performance (F1 0.596, PR-AUC 0.627). SHAP analysis revealed hemoglobin/WBC ratio, platelet count, albumin/total protein ratio, and total protein as top predictors, capturing MM's characteristic hematologic suppression and paraproteinemia without requiring specialized assays. Dimensionality reduction visualization (PCA, UMAP, t-SNE) demonstrated distinct clustering patterns between MM and non-MM CKD patients, confirming inherent feature space separability despite overlapping clinical presentations.
Conclusions: This ML-based screening tool demonstrates potential for identifying high-risk CKD patients warranting MM-specific testing using only routine laboratory data. However, this retrospective single-center study has limitations including potential selection bias and lack of external validation. Prospective multicenter validation studies are needed to assess real-world clinical utility, optimal decision thresholds, and impact on diagnostic timeliness before implementation in clinical practice.
Citation Format: Wen-Chi Wu, Muh-Hwa Yang, Chih-Yu Yang. Machine learning-based screening tool for multiple myeloma detection in chronic kidney disease patients with limited laboratory testing [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 2716.
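The stratified 5-fold cross-validation named in the Methods can be illustrated with a minimal sketch. The abstract does not say which library or random seed the authors used, so this version relies only on the Python standard library; the helper name `stratified_kfold_indices` and the seed are hypothetical, and only the cohort counts (591 MM, 4,168 non-MM) come from the abstract. It shows how stratification keeps MM prevalence close to 12.4% in every fold, which matters when the positive class is this small.

```python
import random


def stratified_kfold_indices(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, preserving class proportions.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal indices round-robin so each fold gets a near-equal share of this class.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Cohort sizes taken from the abstract: 591 MM and 4,168 non-MM CKD patients.
labels = [1] * 591 + [0] * 4168
prevalence = sum(labels) / len(labels)

folds = stratified_kfold_indices(labels, k=5, seed=0)
fold_prevalence = [sum(labels[i] for i in f) / len(f) for f in folds]

print(f"overall MM prevalence: {prevalence:.3f}")
for n, (f, p) in enumerate(zip(folds, fold_prevalence), start=1):
    print(f"fold {n}: n={len(f)}, MM prevalence={p:.3f}")
```

With these counts each fold holds between 951 and 953 patients, of whom 118 or 119 have MM. In each of the five rounds, one fold would serve as the held-out test set and the remaining four as training data. Any resampling such as SMOTEENN would normally be applied to the training folds only, so that synthetic samples do not leak into the evaluation; the abstract does not state how the authors handled this.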