Machine learning models showed dataset-dependent performance for ischemic heart disease prediction, with logistic regression achieving the highest AUROC (0.7234) on the Framingham dataset.
Machine learning models demonstrate strong, dataset-dependent performance for predicting ischemic heart disease risk, with SHAP providing clinical interpretability of key predictors.
Introduction: Cardiovascular disease (CVD) remains the leading cause of mortality worldwide, with coronary artery disease (CAD), also known as ischemic heart disease (IHD), responsible for approximately 13% of global deaths in 2021. Studies applying machine learning (ML) and deep learning (DL) to heart disease classification have demonstrated promising results in risk prediction and feature extraction.
Background/Objectives: In this study, we develop an AI/ML framework to predict and classify ischemic heart disease risk using two publicly available datasets, the Framingham Heart Study and the Cleveland subset of the UCI Heart Disease dataset, and we explain how predictions were made using SHAP (SHapley Additive exPlanations).
Methods: We implemented a leakage-controlled machine learning pipeline that included data cleaning, stratified 80/20 train-test splitting, training-fold-only feature scaling and class balancing, 5-fold hyperparameter tuning, SHAP interpretability, and Brier score-based calibration assessment. Logistic regression, random forest, K-nearest neighbors, XGBoost, and a feedforward neural network were evaluated on both datasets. Performance was assessed using accuracy, precision, recall, F1-score, Matthews correlation coefficient (MCC), AUROC, and Brier score.
Results: After leakage-controlled evaluation, Framingham performance was more modest than in the preliminary analysis. Logistic regression achieved the highest AUROC on the Framingham dataset (0.7234), random forest achieved the lowest Brier score (0.1750), and the feedforward neural network achieved the highest accuracy (0.7719). On the Cleveland subset, logistic regression achieved the strongest threshold-based performance (accuracy 0.8667, precision 0.8571, recall 0.8571, F1-score 0.8571, MCC 0.7321), whereas K-nearest neighbors achieved the highest AUROC (0.9531) and lowest Brier score (0.0942). On the Framingham dataset, SHAP highlighted systolic blood pressure, smoking status, and hypertension as influential predictors; on the Cleveland subset, the top predictors were number of major vessels, chest pain type, thallium stress-test result (thal; normal, fixed defect, or reversible defect), and age.
Conclusions: Optimal model performance is dataset-dependent, and SHAP enhances clinical interpretability. Broader access to high-quality, de-identified medical data could accelerate reproducible ML research in cardiology.
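The leakage-control steps named in Methods (stratified 80/20 splitting, then scaling and class balancing fitted on training data only) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the helper names, the toy data, and the use of random oversampling as the balancing method are all assumptions made for illustration, since the abstract does not specify them. The same fit-on-training-only rule would apply inside each of the 5 tuning folds.

```python
import random
from statistics import mean, pstdev


def stratified_split(labels, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx), preserving the class ratio in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return sorted(train_idx), sorted(test_idx)


def fit_scaler(rows):
    """Per-feature (mean, std) computed from the given rows only."""
    return [(mean(col), pstdev(col) or 1.0) for col in zip(*rows)]


def apply_scaler(rows, params):
    """Standardize rows with previously fitted parameters."""
    return [[(v - m) / s for v, (m, s) in zip(row, params)] for row in rows]


def oversample_minority(rows, labels, seed=0):
    """Duplicate random minority rows until all classes have equal counts.

    Assumption: random oversampling stands in for whatever balancing
    method the study actually used.
    """
    rng = random.Random(seed)
    by_cls = {}
    for row, y in zip(rows, labels):
        by_cls.setdefault(y, []).append(row)
    target = max(len(members) for members in by_cls.values())
    out_rows, out_labels = [], []
    for y, members in by_cls.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows += members + extra
        out_labels += [y] * target
    return out_rows, out_labels


# Hypothetical toy data: 100 rows, 2 features, 20% positive class.
data_rng = random.Random(42)
X = [[data_rng.gauss(130, 20), data_rng.gauss(50, 10)] for _ in range(100)]
y = [1 if i % 5 == 0 else 0 for i in range(100)]

# 1. Split first, so the test rows never influence any fitted statistic.
train_idx, test_idx = stratified_split(y, test_frac=0.2, seed=0)
X_train, y_train = [X[i] for i in train_idx], [y[i] for i in train_idx]
X_test, y_test = [X[i] for i in test_idx], [y[i] for i in test_idx]

# 2. Fit the scaler on training rows only, then apply it to both parts.
scaler = fit_scaler(X_train)
X_train_s = apply_scaler(X_train, scaler)
X_test_s = apply_scaler(X_test, scaler)

# 3. Balance classes in the training part only; the test part keeps
#    its natural class ratio.
X_bal, y_bal = oversample_minority(X_train_s, y_train, seed=0)
```

Fitting the scaler or resampling before the split would let test-set information shape the training data, which tends to inflate reported performance.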
Raman et al. studied ischemic heart disease prediction. Machine learning models were evaluated on model performance (AUROC, accuracy, Brier score).
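For readers less familiar with the reported metrics, the following is a small sketch of how accuracy, MCC, Brier score, and AUROC can be computed from true labels and predicted probabilities. The function names and the toy labels and probabilities are hypothetical and are not taken from the study; the 0.5 decision threshold is also an assumption.

```python
from math import sqrt


def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp + tn) / len(y_true)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred))     # 0.75
print(mcc(y_true, y_pred))          # 0.5
print(brier_score(y_true, y_prob))  # approximately 0.1416
print(auroc(y_true, y_prob))        # 0.875
```

Accuracy and MCC depend on the chosen threshold, while AUROC and Brier score are computed from the probabilities directly, which is one reason different models can lead on different metrics for the same dataset.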