August 15, 2025

Unsupervised 3D Object Detection by Commonsense Clue

Key Points

CPD++ achieves 89.25% 3D average precision (AP) on the moderate car class at a 0.5 IoU threshold, which is 95.3% of the fully supervised result and higher than existing unsupervised 3D detectors.
The approach utilizes commonsense prototypes to guide the detection process, enhancing accuracy without human annotations.
CPD++ learns localization from stationary objects and recognition from moving objects, then transfers each kind of knowledge to the other object type.
The method is trained on the Waymo Open Dataset (WOD) and tested on KITTI, and offers a promising alternative to traditional supervised models.

Abstract

Traditional 3D object detectors, whether fully-, semi-, or weakly-supervised, rely heavily on extensive human annotations. In contrast, this paper introduces an unsupervised 3D object detector that automatically discerns object patterns without such annotations. To achieve this, we propose a Commonsense Prototype-based Detector (CPD) for unsupervised 3D object detection. CPD first constructs Commonsense Prototypes (CProto) to represent the geometric center and size of objects. It then generates high-quality pseudo-labels and guides detector convergence using size and geometry priors from CProto. Building on CPD, we further introduce CPD++, an enhanced version that improves performance by leveraging motion cues. CPD++ learns localization from stationary objects and recognition from moving objects, facilitating the mutual transfer of localization and recognition knowledge between these two object types. Both CPD and CPD++ outperform existing state-of-the-art unsupervised 3D detectors. Furthermore, when trained on Waymo Open Dataset (WOD) and tested on KITTI, CPD++ achieves 89.25% 3D Average Precision (AP) on the moderate car class at a 0.5 IoU threshold, reaching 95.3% of the performance attained by fully supervised counterparts. These results underscore the significant advancements brought by our method.
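To make the reported numbers concrete, the following is a minimal sketch of what a 0.5 IoU threshold means for a 3D box, and of the fully supervised AP implied by the 95.3% ratio. It is not the paper's evaluation code. It assumes axis-aligned boxes in a hypothetical `(cx, cy, cz, length, width, height)` layout, whereas KITTI evaluation uses rotated boxes, and the function name `aabb_iou_3d` and the example boxes are illustrative only.

```python
def aabb_iou_3d(box_a, box_b):
    """IoU of two axis-aligned 3D boxes given as (cx, cy, cz, l, w, h).

    Simplifying assumption: no heading angle, so the overlap along
    each axis can be computed independently.
    """
    intersection = 1.0
    for axis in range(3):
        half_a = box_a[axis + 3] / 2.0
        half_b = box_b[axis + 3] / 2.0
        low = max(box_a[axis] - half_a, box_b[axis] - half_b)
        high = min(box_a[axis] + half_a, box_b[axis] + half_b)
        # Zero overlap on any axis makes the whole intersection zero.
        intersection *= max(0.0, high - low)

    volume_a = box_a[3] * box_a[4] * box_a[5]
    volume_b = box_b[3] * box_b[4] * box_b[5]
    return intersection / (volume_a + volume_b - intersection)


# Hypothetical car-sized boxes: the prediction is shifted 1 m along x.
ground_truth = (0.0, 0.0, 0.0, 4.0, 2.0, 1.5)
prediction = (1.0, 0.0, 0.0, 4.0, 2.0, 1.5)

iou = aabb_iou_3d(ground_truth, prediction)
is_match = iou >= 0.5  # counted as a correct detection at the 0.5 threshold

# Supervised AP implied by "89.25% AP is 95.3% of the supervised result".
implied_supervised_ap = 89.25 / 0.953

print(f"IoU = {iou:.2f}, match = {is_match}")
print(f"Implied supervised AP = {implied_supervised_ap:.2f}%")
```

In this example a 1 m shift on a 4 m long box still clears the threshold, which gives a sense of how much localization error the 0.5 IoU setting tolerates. The implied supervised figure is derived only from the two numbers in the abstract and is not a value the abstract reports directly.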
