Instance segmentation of 3D point clouds is a fundamental task for scene understanding in applications such as autonomous driving, robotics, and augmented reality. The inherent irregularity and sparsity of point clouds, compounded by scale variations and instance adhesion, pose significant challenges to accurate segmentation. Existing grouping-based methods are often limited by the loss of geometric details in single-path backbones and by error propagation near complex boundaries. To address these issues, a Multi-grained Dual-aware Grouping algorithm (MDGroup) is proposed, which explicitly integrates multi-grained feature representation with dual awareness of class and boundary. The algorithm features a Dual-Resolution 3D U-Net (DRNet) that preserves local geometric details while aggregating global semantics through adaptive alignment. A four-branch prediction scheme enhances semantic and offset estimation with boundary and directional cues, enabling fine-grained boundary modeling. Furthermore, a Hierarchical Adaptive Multi-grained Feature fusion framework (HAMF) achieves efficient cross-scale alignment by combining Class-Aware Dynamic Voxelization and Class-Aware Pyramid Scaling. Finally, a Boundary-Aware Weighted Aggregation mechanism (BAWA) refines instance grouping by dynamically weighting semantic confidence, geometric distance, boundary probability, and directional consistency. To extend the model to dynamic scenes, a Temporal Adaptive Gating (TAG) module is introduced to leverage historical frame correlations. Extensive experiments on the ScanNet v2, S3DIS, STPLS3D, SemanticKITTI, LiDAR-Net, and OCID benchmarks demonstrate that MDGroup achieves state-of-the-art performance among grouping-based methods, particularly on small objects, complex boundaries, and dynamic environments.
Sun et al. studied this question.
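To make the boundary-aware weighted aggregation idea concrete, the following is a minimal sketch of how the four cues named in the abstract (semantic confidence, geometric distance, boundary probability, and directional consistency) might be combined into a pairwise affinity and used for grouping. It is not the authors' BAWA implementation: the `Point` fields, the `affinity` and `group` functions, the fixed equal weights (the abstract describes the weighting as dynamic), the `radius`, and the `threshold` are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Point:
    shifted: tuple    # offset-shifted coordinates (x, y, z); assumed input
    sem_conf: float   # semantic confidence in [0, 1]
    boundary: float   # boundary probability in [0, 1]
    direction: tuple  # unit vector toward the predicted instance centre


def affinity(p, q, radius=0.5, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of four cues; fixed weights are an illustrative assumption."""
    # Geometric cue: 1 when the points coincide, 0 at or beyond `radius`.
    geo = max(0.0, 1.0 - math.dist(p.shifted, q.shifted) / radius)
    # Semantic cue: the pair is only as reliable as its weaker member.
    sem = min(p.sem_conf, q.sem_conf)
    # Boundary cue: penalise pairs in which either point is likely on a boundary.
    interior = 1.0 - max(p.boundary, q.boundary)
    # Directional cue: cosine similarity mapped from [-1, 1] to [0, 1].
    cos = sum(a * b for a, b in zip(p.direction, q.direction))
    dir_score = 0.5 * (1.0 + cos)
    w_sem, w_geo, w_bnd, w_dir = weights
    return w_sem * sem + w_geo * geo + w_bnd * interior + w_dir * dir_score


def group(points, threshold=0.6):
    """Union-find grouping: link every pair whose affinity exceeds `threshold`."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force O(n^2) pair scan; a real system would use a spatial index.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if affinity(points[i], points[j]) > threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]
```

In this sketch, two nearby interior points with agreeing directions receive a high affinity and are merged, while a distant point with a high boundary probability and an opposing direction stays in its own group. A learned or confidence-dependent weighting would replace the fixed `weights` tuple.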