What question did this study set out to answer?

The study asks how to overcome three weaknesses of few-shot point cloud semantic segmentation: weak local geometric representation, insufficient use of global context, and features that are vulnerable to noise.

June 15, 2026 · Open Access

A few-shot point cloud segmentation network with multi-dimensional perception and temporal-frequency domain enhancement

Key Points

Developed a Geometry Awareness Module for local geometric representation using multi-scale positional encoding.
Introduced a Global Perception Module that captures long-range contextual dependencies with multi-scale adaptive pooling.
Implemented a Frequency Domain Module utilizing Fourier transforms for enhanced feature robustness against noise.
Achieved mIoU scores of 50.01% on S3DIS and 42.72% on ScanNet under the 1-way 1-shot setting.
Ablation studies confirmed improved segmentation accuracy and prototype robustness in data-scarce, complex indoor environments.
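
For readers unfamiliar with the metric reported above, the following is a minimal sketch of the generic mean intersection-over-union (mIoU) definition applied to per-point labels. The function name `mean_iou` and its arguments are illustrative assumptions. The paper's exact evaluation protocol (for example, which classes are averaged and how results are accumulated across few-shot episodes) is not described on this page and may differ.

```python
def mean_iou(predictions, labels, num_classes):
    """Average per-class IoU over classes present in predictions or labels.

    Generic definition only; the paper's evaluation protocol is an unknown here.
    """
    ious = []
    for cls in range(num_classes):
        # Points that are both predicted as cls and labeled cls.
        intersection = sum(
            1 for p, y in zip(predictions, labels) if p == cls and y == cls
        )
        # Points that are predicted as cls or labeled cls (or both).
        union = sum(
            1 for p, y in zip(predictions, labels) if p == cls or y == cls
        )
        if union > 0:  # skip classes that appear in neither sequence
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: five points, background (0) versus one target class (1).
predicted = [0, 1, 1, 0, 1]
ground_truth = [0, 1, 0, 0, 1]
score = mean_iou(predicted, ground_truth, num_classes=2)
```

In the toy example each class has two correctly labeled points out of a union of three, so both per-class IoU values, and therefore their mean, equal 2/3.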

Abstract

Few-shot point cloud semantic segmentation suffers from weak local geometric representation, insufficient use of global contextual information, and features that are vulnerable to noise interference. To address these limitations, we propose a few-shot point cloud segmentation network that combines multi-dimensional perception with frequency-domain enhancement. First, we establish a multi-dimensional perception mechanism by designing a Geometry Awareness Module (GAM) that models local geometric manifolds through multi-scale positional encoding and explicit neighborhood difference modeling. Second, we introduce a Global Perception Module (GPM) that uses multi-scale adaptive pooling to capture long-range contextual dependencies. This enables feature refinement that spans from local fine-grained structures to the global scene context. Finally, we construct a Frequency Domain Module (FDM) that uses the Fourier transform to disentangle the amplitude spectrum from the phase spectrum and applies an adaptive spectral enhancement strategy to suppress unstable feature perturbations and enhance boundary-sensitive responses. This alleviates the limitations of purely spatial-domain feature representation and enables complementary learning between spatial-domain and frequency-domain features. Experimental results show that the proposed method achieves competitive performance on the S3DIS and ScanNet benchmarks, with mIoU scores of 50.01% and 42.72%, respectively, under the 1-way 1-shot setting. Ablation experiments verify that explicit joint geometric and frequency-domain modeling, together with global semantic consistency regularization, improves segmentation accuracy and prototype robustness in data-scarce and complex indoor environments.
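
The amplitude/phase decomposition that the FDM description relies on can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration: the helper names `dft`, `idft`, and `enhance_in_frequency_domain` are hypothetical, a naive discrete Fourier transform stands in for an FFT routine, and the fixed `gains` list stands in for whatever adaptive, learned spectral weights the actual module uses. The sketch only shows the general idea of rescaling the amplitude spectrum while leaving the phase spectrum untouched.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns a complex sequence."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def enhance_in_frequency_domain(features, gains):
    """Rescale the amplitude of each frequency bin and keep its phase.

    `gains` is a hypothetical stand-in for adaptive weights. For a real-valued
    result, gains should be symmetric: gains[k] == gains[n - k] for k >= 1.
    """
    spectrum = dft(features)
    amplitude = [abs(c) for c in spectrum]        # amplitude spectrum
    phase = [cmath.phase(c) for c in spectrum]    # phase spectrum
    reweighted = [
        cmath.rect(a * g, p) for a, g, p in zip(amplitude, gains, phase)
    ]
    # Drop the imaginary part, which is rounding error for symmetric gains.
    return [value.real for value in idft(reweighted)]


# Toy 1-D feature vector; damp the highest-frequency bins symmetrically.
features = [1.0, 2.0, 0.5, -1.0, 0.0, 1.5, -0.5, 2.0]
gains = [1.0, 1.0, 1.0, 0.5, 0.2, 0.5, 1.0, 1.0]
smoothed = enhance_in_frequency_domain(features, gains)
```

With all gains set to 1.0 the function returns the input up to rounding error, which is a quick sanity check that the amplitude and phase together carry the full signal.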
