Following the breakthrough results of Transformers in natural language processing and computer vision, researchers have sought to bring their modeling capacity to 3D point cloud processing. However, point clouds are inherently unordered and unstructured, which makes it difficult to apply Transformer architectures directly. This research proposes a novel point cloud processing method that introduces point cloud serialization and a serialization-based attention mechanism to improve the performance of the PointNeXt model on semantic segmentation tasks. Traditional point cloud processing methods typically treat a point cloud as an unstructured collection of points, which limits computational efficiency and scalability. Our approach overcomes the constraint of unordered data by serializing the point cloud into a structured, ordered format. We sort the points along space-filling curves (such as Z-order and Hilbert curves), which allows them to be grouped efficiently into non-overlapping patches to which more efficient attention mechanisms can be applied. On top of the serialized point cloud, we incorporate the segment attention mechanism from Point Transformer V3 (PTv3), which exploits the ordering produced by serialization. By designing segment interactions (such as sequential shifting and sequential random mixing), we expand the model’s receptive field while maintaining computational efficiency.
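The serialization and patch-grouping step described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work or in PTv3: the function names (`morton_code`, `serialize`, `to_patches`), the `grid_size` and `bits` parameters, and the padding strategy (repeating the last index to fill the final patch) are all assumptions chosen for illustration, and only the Z-order curve is shown.

```python
import numpy as np


def morton_code(grid: np.ndarray, bits: int = 10) -> np.ndarray:
    """Interleave the bits of integer (x, y, z) grid coordinates into a Z-order key."""
    grid = grid.astype(np.uint64)
    code = np.zeros(grid.shape[0], dtype=np.uint64)
    for b in range(bits):
        for axis in range(3):
            # Take bit b of this axis and place it at position 3*b + axis.
            bit = (grid[:, axis] >> np.uint64(b)) & np.uint64(1)
            code |= bit << np.uint64(3 * b + axis)
    return code


def serialize(points: np.ndarray, grid_size: float = 0.05, bits: int = 10) -> np.ndarray:
    """Return the permutation that sorts points along a Z-order curve (hypothetical helper)."""
    # Quantize coordinates to a non-negative integer grid (grid_size is an assumed value).
    grid = np.floor((points - points.min(axis=0)) / grid_size).astype(np.int64)
    grid = np.clip(grid, 0, 2**bits - 1)
    return np.argsort(morton_code(grid, bits), kind="stable")


def to_patches(order: np.ndarray, patch_size: int) -> np.ndarray:
    """Split the serialized order into fixed-size patches of point indices."""
    # Assumption: the last patch is padded by repeating the final index.
    pad = (-len(order)) % patch_size
    order = np.pad(order, (0, pad), mode="edge")
    return order.reshape(-1, patch_size)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.random((10, 3))
    patches = to_patches(serialize(pts), patch_size=4)
    print(patches.shape)  # one row of point indices per patch
```

Attention would then be computed independently within each row of `patches`, so the cost grows with the patch size rather than with the full number of points. A Hilbert curve could replace `morton_code` without changing the rest of the sketch.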
Teng et al. (Mon,) studied this question.