Recent advances in deep learning have made object detection a critical technology in domains such as autonomous driving, video surveillance, and robotics. Although object detection models achieve high accuracy, they rely heavily on large, meticulously annotated datasets. Annotating data for object detection is significantly more time-consuming and expensive than for standard image classification, which creates substantial barriers in both research and industry. To address this challenge, we propose an adaptive active learning framework that reduces annotation costs without compromising model performance. Although active learning is known to reduce labeling effort, applying it to object detection is challenging because sample selection must account for both uncertainty and diversity. Our approach estimates uncertainty from object confidence scores and quantifies diversity by the number of classes per image across the unlabeled dataset. Moreover, the framework dynamically adjusts the weighting between uncertainty and diversity throughout training. Experiments on an unmanned aerial vehicle (UAV) dataset and a real-world industrial dataset of high-voltage electrical cables showed performance improvements of 1.1% and approximately 10%, respectively, under the same annotation budget. These results indicate that our framework can significantly lower annotation costs while maintaining high detection performance, making it well suited to real-world industrial applications.
Auh et al. studied this question.
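The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: uncertainty is taken as one minus the mean detection confidence, diversity as the fraction of classes predicted in an image, and the weighting follows a hypothetical linear schedule that shifts from diversity toward uncertainty over the active learning rounds. The names `Detection`, `uncertainty_weight`, and `select_images` are illustrative and do not come from the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Detection:
    label: str         # predicted class name
    confidence: float  # detector confidence score in [0, 1]


def uncertainty(dets: List[Detection]) -> float:
    # Assumption: uncertainty = 1 - mean confidence.
    # An image with no detections is treated as maximally uncertain.
    if not dets:
        return 1.0
    return 1.0 - sum(d.confidence for d in dets) / len(dets)


def diversity(dets: List[Detection], num_classes: int) -> float:
    # Assumption: diversity = fraction of the classes predicted in this image.
    return len({d.label for d in dets}) / num_classes


def uncertainty_weight(round_idx: int, total_rounds: int) -> float:
    # Hypothetical linear schedule: 0.0 in the first round (diversity only),
    # 1.0 in the last round (uncertainty only). The direction is an assumption.
    if total_rounds <= 1:
        return 0.5
    return round_idx / (total_rounds - 1)


def select_images(
    predictions: Dict[str, List[Detection]],
    budget: int,
    round_idx: int,
    total_rounds: int,
    num_classes: int,
) -> List[str]:
    # Score each unlabeled image by a weighted sum of uncertainty and diversity,
    # then return the `budget` highest-scoring image ids for annotation.
    w = uncertainty_weight(round_idx, total_rounds)
    scores = {
        image_id: w * uncertainty(dets) + (1.0 - w) * diversity(dets, num_classes)
        for image_id, dets in predictions.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:budget]
```

In this sketch, `predictions` maps each unlabeled image id to the detections produced by the current model, and `select_images` would be called once per active learning round before the selected images are sent for annotation. Other schedules or score definitions could be substituted without changing the overall structure.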