Human activity recognition (HAR) from wearable sensors increasingly faces a dual bottleneck: obtaining labels is expensive, and the labeled subset is often class-imbalanced and redundant. We address this problem with a budget-aware class-balanced active learning framework, termed CCUR-M, that closes the loop between adaptive class balancing, hybrid batch querying, and lightweight retraining. At each round, the labeled subset is rebalanced toward a median target class size through cluster-preserving majority undersampling and minority-class conditional synthesis, after which a hybrid query score combines minimum-confidence uncertainty with cluster-centered representativeness under a round-dependent budget weight. An XGBoost classifier is retrained on the rebalanced set, and the procedure is iterated until the annotation budget is exhausted. We evaluate the method on three public wearable HAR benchmarks with different difficulty profiles: PAMAP2, OPPORTUNITY, and USC-HAD. CCUR-M achieves the best final Macro-F1 on all three datasets, reaching 0.9574, 0.6780, and 0.6128, respectively. The largest final and average gains over the strongest baseline occur on OPPORTUNITY (+0.1205 final, +0.0629 average), while USC-HAD reveals a later-stage rather than early-stage advantage. Ablation experiments show that no single module explains the overall gain; instead, balancing, uncertainty, and representativeness act synergistically, with the full loop outperforming the base variant by +0.1243, +0.1638, and +0.2143 on PAMAP2, OPPORTUNITY, and USC-HAD. These results support a mathematically interpretable view of active learning for imbalanced wearable time series: the key benefit arises from coupling distribution correction and query design within the same budgeted training loop.
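The per-round query step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract names the ingredients (a median target class size, minimum-confidence uncertainty, cluster-centered representativeness, a round-dependent budget weight) but not their exact formulas. The linear weight schedule, the inverse-distance representativeness, the convex combination, and all function names below are hypothetical choices made for illustration only.

```python
import math
from collections import Counter
from statistics import median


def median_target_size(labels):
    """Median class count of the labeled subset, used as the rebalancing target."""
    return median(Counter(labels).values())


def least_confidence(probs):
    """Uncertainty as 1 minus the top predicted class probability."""
    return 1.0 - max(probs)


def representativeness(x, centroids):
    """ASSUMPTION: closeness to the nearest cluster centroid, mapped into (0, 1]."""
    nearest = min(math.dist(x, c) for c in centroids)
    return 1.0 / (1.0 + nearest)


def budget_weight(round_idx, total_rounds):
    """ASSUMPTION: linear schedule, 0 at the first round and 1 at the last."""
    if total_rounds <= 1:
        return 1.0
    return round_idx / (total_rounds - 1)


def hybrid_score(probs, x, centroids, round_idx, total_rounds):
    """ASSUMPTION: convex combination of uncertainty and representativeness."""
    lam = budget_weight(round_idx, total_rounds)
    return lam * least_confidence(probs) + (1.0 - lam) * representativeness(x, centroids)


def select_batch(pool, pool_probs, centroids, round_idx, total_rounds, batch_size):
    """Return indices of the batch_size highest-scoring unlabeled samples."""
    scores = [
        hybrid_score(p, x, centroids, round_idx, total_rounds)
        for x, p in zip(pool, pool_probs)
    ]
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]
```

Under this assumed schedule, early rounds favor samples near cluster centers and later rounds favor samples the classifier is least confident about. In a full loop, the selected indices would be labeled, the labeled set rebalanced toward `median_target_size`, and the classifier retrained before the next round.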
Liu et al. (Tue,) studied this question.