The XGBoost machine learning algorithm achieved approximately 91.2% accuracy and a 90.8% F1 score in predicting coronary heart disease, outperforming Random Forest and Logistic Regression.
Does the XGBoost machine learning algorithm improve the prediction of coronary heart disease compared to Random Forest and Logistic Regression?
The XGBoost machine learning algorithm provides a highly predictive tool capable of identifying subtle risk patterns for coronary heart disease from clinical data.
Effect estimate: Accuracy difference ≈3.4 percentage points favoring XGBoost
Absolute event rate: 91.2% vs 87.8%
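The effect estimate above is the plain difference between the two reported accuracies, so it is a gap in percentage points rather than a relative change. The following minimal sketch shows the arithmetic. The function name `compare_accuracy` is hypothetical, and the relative gain and error reduction figures are derived here for illustration and are not reported in the source.

```python
def compare_accuracy(acc_model: float, acc_baseline: float) -> dict:
    """Compare two accuracies given as fractions strictly between 0 and 1."""
    if not (0 < acc_baseline < 1):
        raise ValueError("acc_baseline must be strictly between 0 and 1")
    gap = acc_model - acc_baseline
    return {
        # Plain difference, expressed in percentage points.
        "absolute_pp": gap * 100,
        # Gain relative to the baseline accuracy.
        "relative_gain_pct": gap / acc_baseline * 100,
        # Share of the baseline's misclassifications that disappear.
        "error_reduction_pct": gap / (1 - acc_baseline) * 100,
    }


# Reported accuracies: 91.2% for XGBoost vs 87.8% for the comparator.
result = compare_accuracy(0.912, 0.878)
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

Read this way, a 3.4 point gap corresponds to the error rate falling from 12.2% to 8.8%.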
Heart disease is one of the most serious diseases worldwide, and its high morbidity and mortality rates pose a latent risk over time. This research evaluates Machine Learning (ML) models, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Logistic Regression (LR), for the prediction of coronary heart disease (CHD), with the aim of identifying the most efficient model for this task. Model construction followed the first five phases of the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology: business understanding, data understanding, data preparation, modeling, and evaluation. The modeling results revealed the superior predictive capability of the XGBoost algorithm for detecting coronary heart disease, compared to Random Forest and Logistic Regression. The assessment of performance metrics (Accuracy, Precision, Sensitivity, and F1 Score) established XGBoost as the reference model, with an F1 Score of approximately 90.8%. This superiority is attributed to its robustness in capturing nonlinear interactions among clinical variables. Consequently, the XGBoost model is selected as the optimal tool for integration into future medical decision support systems. In summary, this ML-based approach provides a highly predictive tool capable of identifying subtle risk patterns from real clinical data. The XGBoost model is a promising candidate for integration into decision support systems and for the optimization of primary prevention protocols for coronary heart disease.
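The four metrics named in the abstract can all be computed from a binary confusion matrix. The following is a minimal sketch, assuming CHD-positive is the positive class. The `ConfusionMatrix` class, the `classification_metrics` function, and the counts are hypothetical illustrations and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # CHD cases correctly flagged
    fp: int  # healthy patients wrongly flagged
    fn: int  # CHD cases missed
    tn: int  # healthy patients correctly cleared


def classification_metrics(cm: ConfusionMatrix) -> dict:
    """Return accuracy, precision, sensitivity (recall), and F1 as fractions."""
    total = cm.tp + cm.fp + cm.fn + cm.tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (cm.tp + cm.tn) / total
    # Guard against empty denominators when a class is never predicted or present.
    precision = cm.tp / (cm.tp + cm.fp) if (cm.tp + cm.fp) else 0.0
    sensitivity = cm.tp / (cm.tp + cm.fn) if (cm.tp + cm.fn) else 0.0
    denom = precision + sensitivity
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Illustrative counts only (not the study's data).
metrics = classification_metrics(ConfusionMatrix(tp=450, fp=40, fn=50, tn=460))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 ignores true negatives, it can diverge from accuracy when the classes are imbalanced, which is one reason to report both.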
Paucar et al. conducted a study in adults with clinical and demographic features for coronary heart disease prediction (n=10,000). The XGBoost machine learning algorithm was compared with the Random Forest and Logistic Regression machine learning algorithms on prediction accuracy of coronary heart disease status (accuracy difference ≈3.4 percentage points favoring XGBoost). XGBoost achieved approximately 91.2% accuracy and a 90.8% F1 score, outperforming both comparators.