What question did this study set out to answer?

The study aims to enhance human activity recognition by addressing class imbalance and the cost of labeling data.

June 4, 2026 | Open Access

A Budget-Aware Class-Balanced Active Learning Framework for Imbalanced Wearable Human Activity Recognition

Key Points

The study aims to enhance human activity recognition by addressing class imbalance and the cost of labeling data.
Developed a budget-aware active learning framework, CCUR-M, integrating class balancing and query strategies.
Utilized cluster-preserving undersampling and minority-class conditional synthesis to achieve balanced class representation.
Trained an XGBoost classifier on the rebalanced dataset, iterating the process until the budget was exhausted.
Achieved the best final Macro-F1 on all three datasets: 0.9574, 0.6780, and 0.6128 on PAMAP2, OPPORTUNITY, and USC-HAD, respectively.
Showed the largest gains over the strongest baseline on OPPORTUNITY (+0.1205 final Macro-F1, +0.0629 average), with a later-stage rather than early-stage advantage on USC-HAD.
Showed that no single component accounted for the improvement; rather, the combined approach was most effective.
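
The median-target rebalancing step named in the key points can be illustrated with a minimal Python sketch. It is not the authors' implementation: random undersampling stands in for cluster-preserving undersampling, duplication stands in for minority-class conditional synthesis, and the function name rebalance_to_median is hypothetical. It only shows how a median class size can serve as the common target.

```python
import random
from collections import defaultdict
from statistics import median


def rebalance_to_median(labels, seed=0):
    """Return (indices, target) so that every class has `target` samples.

    Simplifying assumptions, not the paper's method:
    - majority classes are randomly undersampled (the paper uses
      cluster-preserving undersampling);
    - minority classes are padded by duplicating samples (the paper uses
      minority-class conditional synthesis).
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # The target size is the median class size (truncated to an integer).
    target = int(median(len(idxs) for idxs in by_class.values()))

    selected = []
    for label in sorted(by_class):
        idxs = by_class[label]
        if len(idxs) > target:
            # Majority class: keep a random subset of size `target`.
            selected.extend(rng.sample(idxs, target))
        else:
            # Minority class: keep everything and pad with duplicates.
            extra = [rng.choice(idxs) for _ in range(target - len(idxs))]
            selected.extend(idxs + extra)
    return selected, target
```

With class sizes of 10, 4, and 2, the target is 4: the largest class is cut to 4 samples and the smallest is padded to 4, so the classifier is retrained on equal class sizes.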

Abstract

Human activity recognition (HAR) from wearable sensors increasingly faces a dual bottleneck: obtaining labels is expensive, and the labeled subset is often class-imbalanced and redundant. We address this problem with a budget-aware class-balanced active learning framework, termed CCUR-M, that closes the loop between adaptive class balancing, hybrid batch querying, and lightweight retraining. At each round, the labeled subset is rebalanced toward a median target class size through cluster-preserving majority undersampling and minority-class conditional synthesis, after which a hybrid query score combines minimum-confidence uncertainty with cluster-centered representativeness under a round-dependent budget weight. An XGBoost classifier is retrained on the rebalanced set, and the procedure is iterated until the annotation budget is exhausted. We evaluate the method on three public wearable HAR benchmarks with different difficulty profiles: PAMAP2, OPPORTUNITY, and USC-HAD. CCUR-M achieves the best final Macro-F1 on all three datasets, reaching 0.9574, 0.6780, and 0.6128, respectively. The largest final and average gains over the strongest baseline occur on OPPORTUNITY (+0.1205 final, +0.0629 average), while USC-HAD reveals a later-stage rather than early-stage advantage. Ablation experiments show that no single module explains the overall gain; instead, balancing, uncertainty, and representativeness act synergistically, with the full loop outperforming the base variant by +0.1243, +0.1638, and +0.2143 on PAMAP2, OPPORTUNITY, and USC-HAD. These results support a mathematically interpretable view of active learning for imbalanced wearable time series: the key benefit arises from coupling distribution correction and query design within the same budgeted training loop.
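
To make the hybrid query score concrete, the following Python sketch combines an uncertainty term with a representativeness term under a round-dependent weight. The abstract does not give the exact formulas, so several details are assumptions for illustration: uncertainty is taken as one minus the top predicted probability, representativeness as an inverse distance to the nearest cluster centroid, and the weight as a linear schedule over rounds. The names hybrid_scores and select_batch are hypothetical.

```python
import math


def hybrid_scores(probs, features, centroids, round_idx, total_rounds):
    """Score unlabeled samples; a higher score means more worth labeling.

    probs:     per-sample class probabilities from the current classifier
    features:  per-sample feature vectors
    centroids: cluster centers of the unlabeled pool
    """
    # Assumed schedule: early rounds favor representativeness,
    # later rounds favor uncertainty.
    lam = round_idx / max(total_rounds - 1, 1)

    scores = []
    for p, x in zip(probs, features):
        # Assumed uncertainty: low top-class confidence gives a high value.
        uncertainty = 1.0 - max(p)
        # Assumed representativeness: closeness to the nearest centroid.
        dist = min(math.dist(x, c) for c in centroids)
        representativeness = 1.0 / (1.0 + dist)
        scores.append(lam * uncertainty + (1.0 - lam) * representativeness)
    return scores


def select_batch(scores, batch_size):
    """Return the indices of the `batch_size` highest-scoring samples."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:batch_size]
```

Under this assumed schedule, the first round selects samples near cluster centers and the last round selects the samples the classifier is least confident about. The paper's actual weighting may differ.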
