This study aimed to develop and validate a biomarker-based prediction model for assessing the individual risk of coronary artery lesions (CAL) in Kawasaki disease (KD). A retrospective analysis was performed on 345 pediatric KD patients admitted between June 2018 and June 2022. Patients were randomly divided into training (n = 241) and validation (n = 104) sets. Univariate analysis identified candidate predictors, and Least Absolute Shrinkage and Selection Operator (LASSO) regression was used for feature selection. Multivariable logistic regression and three machine learning models (random forest [RF], support vector machine, and k-nearest neighbors) were developed. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis. A nomogram was constructed, and SHapley Additive exPlanations (SHAP) values were applied to interpret feature contributions. Seven biomarkers were significantly associated with CAL in univariate analysis (P < 0.05). LASSO and multivariable logistic regression analysis identified age, N-terminal pro-B-type natriuretic peptide, interleukin-6, calprotectin, endothelial microparticles, matrix metalloproteinase-9, and galectin-3 as independent predictors. The RF model performed best among the models compared, with AUCs of 0.888 (training) and 0.860 (validation). SHAP analysis confirmed three of these variables as the top contributors to CAL prediction. The nomogram exhibited strong calibration and clinical utility. The machine learning-based prediction model incorporating novel biomarkers enables individualized risk assessment for CAL development in KD patients. With its strong discrimination in both the training and validation sets, the model may facilitate early identification of high-risk patients and the implementation of targeted interventions, which could help optimize healthcare resource allocation and improve long-term cardiovascular outcomes.
Wang et al. studied this question.
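The random 241/104 split and the AUC metric described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `split_indices` and `auc`, the seed, and the toy labels and scores are all hypothetical, and the AUC is computed with the simple pairwise (Mann-Whitney) definition rather than any particular library routine.

```python
import random

def split_indices(n_total, n_train, seed=0):
    """Randomly partition range(n_total) into training and validation index lists."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is reproducible
    idx = list(range(n_total))
    rng.shuffle(idx)
    return idx[:n_train], idx[n_train:]

def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a random negative case.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Cohort sizes taken from the abstract: 345 patients, 241 for training, 104 for validation.
train_idx, valid_idx = split_indices(345, 241)

# Made-up labels (1 = CAL, 0 = no CAL) and predicted risks, for illustration only.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)
```

The pairwise definition is quadratic in the number of cases, which is acceptable for a cohort of this size; a rank-based implementation would be preferable for much larger datasets.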