What question did this study set out to answer?

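This analysis compares a C4.5-based decision tree with K-Nearest Neighbors, random forest, and Multilayer Perceptron models for predicting heart disease on a publicly available multi-country dataset (N = 918).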

May 31, 2026

A fair comparison and association rule analysis in machine learning to predict heart disease

Key Points

Retrospective secondary analysis of a multi-country dataset (N = 918) from the UCI Machine Learning Repository.
Developed a C4.5-based decision tree model with information gain-based pruning, validated through 5-fold cross-validation and a leave-one-country-out strategy.
Evaluated model performance using accuracy, precision, recall, F1-score, AUC, Brier score, and computational metrics (training and inference time).
The decision tree achieved an accuracy of 0.8366 ± 0.0329, an F1-score of 0.8319 ± 0.0344, and an AUC of 0.8981 ± 0.0277 in internal validation.
External validation showed performance variability across countries, indicating sensitivity to distribution shifts.
The model demonstrated low computational cost, with a training time of 0.0028 ± 0.0015 seconds.
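
The leave-one-country-out validation and Brier score named in the points above can be illustrated with a minimal sketch. It is not the study's code: the country labels and outcomes are invented toy values, the function names are hypothetical, and a trivial "predict the training prevalence" baseline stands in for the decision tree so that the example stays self-contained.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) once per distinct group label."""
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    for held_out in sorted(by_group):
        test_idx = by_group[held_out]
        train_idx = sorted(
            i for group, idxs in by_group.items() if group != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def brier_by_held_out_country(countries, outcomes):
    """Score a placeholder model on each held-out country.

    Assumption: the placeholder predicts the training-set prevalence for
    every test record. A real analysis would fit the classifier here.
    """
    scores = {}
    for held_out, train_idx, test_idx in leave_one_group_out(countries):
        prevalence = sum(outcomes[i] for i in train_idx) / len(train_idx)
        y_test = [outcomes[i] for i in test_idx]
        y_prob = [prevalence] * len(test_idx)
        scores[held_out] = brier_score(y_test, y_prob)
    return scores


# Toy data, invented for illustration only (not taken from the dataset).
countries = ["A", "A", "B", "B", "C", "C"]
outcomes = [1, 0, 1, 0, 1, 1]

scores = brier_by_held_out_country(countries, outcomes)
print(scores)
```

On this toy input the held-out scores are 0.3125, 0.3125, and 0.25 for countries A, B, and C. Lower Brier scores indicate better-calibrated probabilities, and differences between held-out countries are the kind of variability that the external validation is designed to expose.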

Abstract

Background: Heart disease prediction is a critical task in clinical decision support, particularly in settings with high physician workloads. Interpretable and computationally efficient models are needed to facilitate transparent and practical implementation in healthcare environments.
Methods: This retrospective secondary analysis used a publicly available multi-country heart disease dataset (N = 918) derived from the UCI Machine Learning Repository. The primary outcome was binary heart disease status. A C4.5-based decision tree model with information gain-based pruning was developed using predefined predictors selected by gain ratio. Internal validation was performed using 5-fold cross-validation. External validation was conducted using a leave-one-country-out strategy to assess generalizability across national cohorts. Model performance was evaluated in terms of discrimination (accuracy, precision, recall, F1-score, and Area Under the Curve (AUC)), calibration (Brier score and calibration plots), and computational complexity (training and inference time). Comparative analyses were conducted against K-Nearest Neighbors (KNN), random forest, and Multilayer Perceptron (MLP) models using consistent parameter settings. Subgroup analyses by age and sex were also performed.
Results: In internal validation, the decision tree achieved an accuracy of 0.8366 ± 0.0329 and an F1-score of 0.8319 ± 0.0344, with an AUC of 0.8981 ± 0.0277 and a Brier score of 0.1206 ± 0.0182. The model demonstrated low computational cost (training time: 0.0028 ± 0.0015 seconds). External validation revealed performance variability across countries, indicating sensitivity to distribution shifts. Subgroup analyses showed generally consistent performance across age and sex strata, although instability was observed in data-scarce subgroups.
Conclusion: The proposed C4.5-based model provides interpretable rule-based predictions with competitive discrimination, acceptable calibration, and low computational complexity. While performance varies across national cohorts, the model demonstrates potential as a transparent and resource-efficient prototype clinical decision support tool, warranting further prospective validation.
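
The abstract states that predictors were selected by gain ratio, the split criterion associated with C4.5. As a rough illustration of that quantity, and not a reproduction of the study's implementation, the sketch below computes gain ratio for categorical attributes as information gain divided by split information. The function names and toy values are hypothetical, and continuous attributes (which C4.5 handles by thresholding) are not covered.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(values).values()
    )


def gain_ratio(feature_values, labels):
    """Information gain of a categorical feature divided by its split information."""
    total = len(labels)
    partitions = {}
    for value, label in zip(feature_values, labels):
        partitions.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(
        len(part) / total * entropy(part) for part in partitions.values()
    )
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the feature's own value distribution.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy values, invented for illustration only.
labels = [1, 1, 0, 0]
binary_feature = ["yes", "yes", "no", "no"]   # separates the classes with two values
id_like_feature = ["a", "b", "c", "d"]        # also separates them, but with four values

print(gain_ratio(binary_feature, labels))
print(gain_ratio(id_like_feature, labels))
```

Both toy features have the same information gain (1 bit), but the four-valued feature has twice the split information, so its gain ratio is 0.5 rather than 1.0. This penalty on many-valued attributes is the usual motivation for preferring gain ratio over raw information gain.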
