Because it can perceive fine-grained 3D scenes and recognize objects of arbitrary shape, 3D occupancy prediction plays a crucial role in vision-centric autonomous driving and robotics. However, most existing methods rely on voxel-based representations, which demand large amounts of memory and computing resources. To address this challenge and enable more efficient 3D occupancy prediction, we propose HBEVOcc, a Bird’s-Eye-View (BEV) based method for 3D scene representation with a novel height-aware deformable attention module. The module leverages latent height information within the BEV framework to compensate for the lack of an explicit height dimension, significantly reducing computing resource consumption while improving performance. Specifically, our method first extracts multi-camera image features and lifts these 2D features into 3D BEV occupancy features via explicit and implicit view transformations. The BEV features are then processed by a BEV feature extraction network and the height-aware deformable attention module, and a prediction head produces the final 3D occupancy prediction. To further strengthen voxel supervision along the height axis, we introduce a height-aware voxel loss with adaptive vertical weighting. Extensive experiments on the Occ3D-nuScenes and OpenOcc datasets demonstrate that HBEVOcc achieves state-of-the-art results in terms of both mIoU and RayIoU with less training memory (even when trained on a 2080Ti GPU).
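The abstract does not spell out the height-aware voxel loss, so the following is only a minimal sketch of one way a vertically weighted voxel loss could look. It assumes per-height-level weights derived from inverse occupancy frequency and a plain cross-entropy over a single voxel column. The names `height_weights` and `height_weighted_ce`, the `eps` smoothing term, and the weighting rule itself are hypothetical illustrations, not the formulation used in HBEVOcc.

```python
import math


def height_weights(occupied_counts, eps=1.0):
    """Hypothetical weighting rule (an assumption, not the paper's):
    height levels with fewer occupied voxels get larger weights.
    Weights are normalized so that their mean is 1."""
    inverse = [1.0 / (count + eps) for count in occupied_counts]
    mean = sum(inverse) / len(inverse)
    return [value / mean for value in inverse]


def height_weighted_ce(probs, labels, weights):
    """Cross-entropy over one voxel column, weighted per height level.

    probs[z][k] is the predicted probability of class k at height level z,
    labels[z] is the ground-truth class index at that level, and
    weights[z] is the weight of that level.
    """
    total = 0.0
    for z, (level_probs, label) in enumerate(zip(probs, labels)):
        # Clamp to avoid log(0) for confidently wrong predictions.
        total += weights[z] * -math.log(max(level_probs[label], 1e-12))
    return total / sum(weights)


# Example: three height levels, two classes (0 = free, 1 = occupied).
weights = height_weights([9, 4, 0])
column_probs = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]
column_labels = [0, 0, 1]
loss = height_weighted_ce(column_probs, column_labels, weights)
```

With uniform weights this reduces to ordinary mean cross-entropy over the column. The weighting only changes how much each height level contributes to the average.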
Chuandong Lyu
Wenkai Li
Iman Yi Liao
Lyu et al. studied this question.
www.synapsesocial.com/papers/69810013c1c9540dea8132e8 — DOI: https://doi.org/10.3390/s26030934