Heart disease remains the leading cause of mortality worldwide. Early diagnosis is essential, yet traditional machine learning-based diagnostic systems depend on large amounts of expert-labeled data, which makes them costly and time-consuming to develop. To address this challenge, we propose a heterogeneous committee-based adaptive active learning (HC-AAL) framework for heart disease prediction. The method combines several machine learning models, including XGBoost, Random Forest, and Logistic Regression, into a diverse committee that iteratively selects the most informative samples using a committee-weighted query-by-committee (CW-QBC) uncertainty measure. To enhance diversity and representativeness, K-Means clustering is applied to the uncertain instances. An adaptive query length mechanism dynamically adjusts the number of samples queried at each iteration to balance learning efficiency and computational cost. Experimental results demonstrate that the proposed HC-AAL framework achieves 96.04% accuracy using only 31.9% of the labeled data, significantly outperforming traditional fully supervised approaches. These findings highlight the effectiveness of adaptive active learning strategies in reducing annotation costs while maintaining high predictive performance.
Goli et al. (Thu,) studied this question.
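
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact CW-QBC formula or the rule for adapting the query length, so everything below is an assumption: disagreement is scored as a weighted vote entropy, the member weights stand in for something like validation accuracy, and the names `weighted_vote_entropy`, `select_queries`, and `adapt_batch_size` are hypothetical. The K-Means diversity step is omitted for brevity.

```python
import math
from collections import defaultdict


def weighted_vote_entropy(votes, weights):
    """Disagreement score for one sample (assumed form of CW-QBC).

    votes   -- predicted label from each committee member
    weights -- non-negative weight per member (assumed: validation accuracy)
    """
    total = sum(weights)
    mass = defaultdict(float)
    for label, w in zip(votes, weights):
        mass[label] += w / total  # weighted share of the vote per label
    return -sum(p * math.log(p) for p in mass.values() if p > 0)


def select_queries(pool_votes, weights, batch_size):
    """Return indices of the batch_size samples with the highest disagreement."""
    scores = [weighted_vote_entropy(v, weights) for v in pool_votes]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size], scores


def adapt_batch_size(current, gain, min_size=5, max_size=50, tol=0.005):
    """Hypothetical adaptive rule: halve the batch when the accuracy gain
    stalls (below tol), otherwise double it, within fixed bounds."""
    if gain < tol:
        return max(min_size, current // 2)
    return min(max_size, current * 2)


# Votes of a three-member committee on three unlabeled samples.
pool_votes = [[1, 1, 1], [1, 0, 0], [1, 1, 0]]
weights = [0.5, 0.3, 0.2]
picked, scores = select_queries(pool_votes, weights, batch_size=2)
print(picked)
```

In a fuller pipeline, the votes would come from fitted XGBoost, Random Forest, and Logistic Regression models, and the top-scoring samples would be clustered before labeling so that the queried batch is not redundant.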