What type of study is this?

This is an experimental study.

September 20, 2025 | Open Access

CrossAlignNet: a self-supervised feature learning framework for 3D point cloud understanding

Key Points

The proposed framework improves cross-modal semantic consistency through a self-supervised global semantic alignment task based on contrastive learning.
The learned point cloud representations achieve superior or comparable results against existing methods on object classification, few-shot classification, and part segmentation.
A dual-task learning approach combines global semantic alignment with local mask reconstruction, so that features capture both local geometric structure and global semantic information.
The newly constructed ShapeNet3D-CMA dataset provides accurate point cloud-image spatial mappings to support cross-modal learning.

Abstract

We propose CrossAlignNet, a self-supervised point cloud representation learning framework based on a cross-modal mask alignment strategy. It addresses two problems in existing methods: the imbalance between global semantic and local geometric feature learning, and cross-modal information asymmetry. A synchronized mask alignment strategy establishes geometrically consistent masked regions between the point cloud patches and the corresponding image patches, ensuring cross-modal information symmetry. A dual-task learning framework is designed: the global semantic alignment task enhances cross-modal semantic consistency through contrastive learning, and the local mask reconstruction task fuses image cues through a cross-attention mechanism to recover the local geometric structure of the masked point cloud. In addition, the ShapeNet3D-CMA dataset is constructed to provide accurate point cloud-image spatial mapping relations that support cross-modal learning. Our framework shows superior or comparable results against existing methods on three point cloud understanding tasks: object classification, few-shot classification, and part segmentation.
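
As a rough illustration of two ideas in the abstract, the NumPy sketch below draws one mask that is shared by paired point cloud and image patches, and computes a symmetric InfoNCE-style contrastive loss between global features of the two modalities. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract says only "contrastive learning", so the InfoNCE form, the temperature of 0.07, the mask ratio of 0.6, and the function names `synchronized_mask` and `symmetric_info_nce` are all hypothetical, and the sketch assumes patch i of the point cloud already corresponds to patch i of the image.

```python
import numpy as np


def synchronized_mask(num_patches, mask_ratio, rng):
    """Draw ONE boolean mask to apply to both modalities (True = masked).

    Assumes patch i of the point cloud corresponds to patch i of the image.
    """
    num_masked = int(round(num_patches * mask_ratio))
    masked_idx = rng.permutation(num_patches)[:num_masked]
    mask = np.zeros(num_patches, dtype=bool)
    mask[masked_idx] = True
    return mask


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(point_feats, image_feats, temperature=0.07):
    """Symmetric InfoNCE loss; row i of each input is the same object.

    Matching pairs sit on the diagonal of the similarity matrix and are
    treated as positives; every other entry is a negative.
    """
    p = l2_normalize(point_feats)
    q = l2_normalize(image_feats)
    logits = p @ q.T / temperature
    diag = np.arange(len(p))
    loss_point_to_image = -log_softmax(logits, axis=1)[diag, diag].mean()
    loss_image_to_point = -log_softmax(logits, axis=0)[diag, diag].mean()
    return 0.5 * (loss_point_to_image + loss_image_to_point)


# Usage with random stand-in data (shapes are illustrative assumptions).
rng = np.random.default_rng(0)
mask = synchronized_mask(num_patches=64, mask_ratio=0.6, rng=rng)
point_patches = rng.normal(size=(64, 32))
image_patches = rng.normal(size=(64, 32))
visible_point_patches = point_patches[~mask]  # same indices hidden ...
visible_image_patches = image_patches[~mask]  # ... in both modalities

global_point_feats = rng.normal(size=(8, 16))
global_image_feats = global_point_feats + 0.1 * rng.normal(size=(8, 16))
loss = symmetric_info_nce(global_point_feats, global_image_feats)
```

Sharing a single mask keeps the visible and hidden regions identical across modalities, which is one simple reading of the "cross-modal information symmetry" the abstract describes; the local reconstruction task with cross-attention is not covered by this sketch.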
