3D Semantic Scene Completion (SSC) aims to infer voxel-level occupancy and semantics from partial 2D observations. However, existing methods often rely on global attention or uniform voxel modeling, which can cause semantic interference across unrelated regions and degrade performance under occlusion. To address this, we propose HGroupScene, a unified framework that integrates spatial priors and region-constrained reasoning for robust SSC. HGroupScene introduces: (1) a Hierarchical Grouping Module that partitions the voxel space into subregions and performs semantic aggregation via differentiable Gumbel-Softmax attention; (2) a dual-branch architecture composed of an Explicit Constraint Branch for extracting region-level structural features and an Implicit Diffusion Branch for fine-grained semantic reasoning; and (3) a Region-Constrained Feature Diffusion Mechanism that enables controllable feature propagation under the guidance of structural region priors. Extensive experiments on SemanticKITTI and SSCBench-KITTI360 under both single-frame and multi-frame settings demonstrate that HGroupScene achieves performance competitive with or superior to state-of-the-art methods, validating the effectiveness of spatially structured semantic modeling.
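To make the grouping idea in (1) concrete, the following is a minimal sketch of how voxel features could be softly assigned to a small set of group tokens with Gumbel-Softmax and then aggregated per group. It is an illustration under stated assumptions, not the HGroupScene implementation: the function names `gumbel_softmax` and `group_voxels`, the dot-product similarity, the temperature `tau`, and the weighted-mean aggregation are all hypothetical choices, and the pure-Python code omits gradients, the hierarchy of grouping stages, and any straight-through estimator.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Sample a relaxed one-hot vector from `logits` (Gumbel-Softmax relaxation)."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1);
        # U is clamped away from 0 and 1 to keep the logs finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / tau)
    peak = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def group_voxels(voxel_feats, group_tokens, tau=1.0, rng=random):
    """Softly assign each voxel to a group token and average features per group.

    Hypothetical helper: similarity is a plain dot product, and each group
    feature is the assignment-weighted mean of the voxel features.
    """
    dim = len(group_tokens[0])
    sums = [[0.0] * dim for _ in group_tokens]
    mass = [0.0] * len(group_tokens)
    assignments = []
    for feat in voxel_feats:
        # One logit per group: similarity between this voxel and the group token.
        logits = [sum(f * g for f, g in zip(feat, tok)) for tok in group_tokens]
        weights = gumbel_softmax(logits, tau, rng)
        assignments.append(weights)
        for k, w in enumerate(weights):
            mass[k] += w
            for d in range(dim):
                sums[k][d] += w * feat[d]
    # Normalize by the total assignment mass of each group (guarding against zero).
    grouped = [[s / max(m, 1e-8) for s in row] for row, m in zip(sums, mass)]
    return assignments, grouped


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sampled noise is repeatable
    voxels = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    tokens = [[1.0, 0.0], [0.0, 1.0]]
    assignments, grouped = group_voxels(voxels, tokens, tau=0.5, rng=rng)
    for row in assignments:
        print([round(w, 3) for w in row])
    print(grouped)
```

Lowering `tau` pushes each assignment row toward a one-hot vector, which in this sketch means harder, more region-like groups; raising it blends voxels across groups. In a trainable setting the same relaxation is what keeps the discrete assignment differentiable.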