What question did this study set out to answer?

This study aims to improve semantic segmentation of airborne LiDAR point clouds using a deep learning approach with a transformer-enhanced architecture.

May 4, 2026 | Open Access

Deep Learning-Based Semantic Segmentation of Airborne LiDAR Point Clouds Using a Transformer-Enhanced PointNet++ Architecture

Key Points

The Oregon LiDAR Program dataset was used for four-class semantic segmentation (unclassified, vegetation, ground, and building).
Point clouds were resampled to 4096 points, each carrying X, Y, Z coordinates, RGB, and intensity features.
The proposed PointNet++ MSG Transformer model was compared with baseline and state-of-the-art models under various training configurations.
The proposed approach achieved a mean Intersection over Union (mIoU) of 51.74% on the test dataset.
It attained an accuracy of 61.50% on the same test dataset.
The evaluation indicated advantages of multi-scale feature extraction and transformer-based feature fusion.
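
The fixed-size resampling mentioned in the Key Points can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the source states only that each sample has 4096 points with seven features, so the sampling strategy (random subsampling when a tile is large enough, padding with duplicated points otherwise) and the name `resample_points` are assumptions made for this example.

```python
import random

N_POINTS = 4096  # fixed input size stated in the source
N_FEATURES = 7   # X, Y, Z, R, G, B, intensity (per the source)


def resample_points(points, n_points=N_POINTS, seed=None):
    """Return exactly n_points rows drawn from `points`.

    Assumed strategy (not confirmed by the source):
    - enough points: random subsample without replacement;
    - too few points: keep every point, then pad with random duplicates.
    """
    if not points:
        raise ValueError("cannot resample an empty point cloud")
    rng = random.Random(seed)
    n = len(points)
    if n >= n_points:
        idx = rng.sample(range(n), n_points)
    else:
        idx = list(range(n)) + [rng.randrange(n) for _ in range(n_points - n)]
    return [points[i] for i in idx]
```

Each row is expected to be a sequence of `N_FEATURES` values. In practice an array library would likely replace the plain lists, but the indexing logic stays the same.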

Abstract

Airborne LiDAR (Light Detection and Ranging) data are widely used in urban modelling and three-dimensional spatial analysis. However, the irregular structure of LiDAR point clouds, their varying point densities, and the class imbalance in available datasets make semantic segmentation difficult. This study addresses four-class semantic segmentation (unclassified, vegetation, ground, and building) of airborne LiDAR point clouds. The Oregon LiDAR Program dataset was obtained through the OpenTopography platform. The point cloud data were resampled to 4096 points to ensure a fixed input size; for each point, the X, Y, and Z coordinates and the RGB and intensity features were used. The proposed PointNet2 MSG Transformer model builds on the PointNet++ MSG architecture and adds a transformer-based feature fusion module. Experiments compared it with baseline models (PointNet, PointNet++ MSG, and VoxelNet Lite) and with recent state-of-the-art architectures, including Point Transformer, KPConv, and RandLA-Net. Different loss functions and training configurations were evaluated, and the effects of ensemble learning and test-time augmentation on model performance were analyzed. The proposed approach achieved a mean Intersection over Union (mIoU) of 51.74% and an accuracy of 61.50% on the test dataset. These results indicate that combining multi-scale feature extraction with transformer-based feature fusion is effective for multi-class semantic segmentation of LiDAR point clouds.
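
For readers unfamiliar with the two reported metrics, the sketch below shows one common way to compute mean IoU and overall accuracy from per-point labels via a confusion matrix. It is an illustration under stated assumptions, not the authors' evaluation code: the class index order in `CLASSES`, the function names, and the choice to skip classes absent from both prediction and ground truth are all assumptions of this example.

```python
# Class order is an assumption; the source only names the four classes.
CLASSES = ["unclassified", "vegetation", "ground", "building"]


def confusion_matrix(y_true, y_pred, n_classes=len(CLASSES)):
    """cm[t][p] counts points with true class t predicted as class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou(cm):
    """IoU_c = TP / (TP + FP + FN); None if class c never occurs."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                  # true c, predicted otherwise
        fp = sum(row[c] for row in cm) - tp   # predicted c, true otherwise
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


def mean_iou(cm):
    """Average IoU over the classes that occur at least once."""
    valid = [v for v in per_class_iou(cm) if v is not None]
    return sum(valid) / len(valid)


def overall_accuracy(cm):
    """Fraction of points whose predicted class equals the true class."""
    total = sum(sum(row) for row in cm)
    return sum(cm[c][c] for c in range(len(cm))) / total
```

Because IoU penalizes both false positives and false negatives for each class and is then averaged across classes, it is typically lower than overall accuracy on imbalanced data, which is consistent with the gap between the two figures reported above.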

Sevinç et al. studied this question.

synapsesocial.com/papers/69f837ab3ed186a739981e7f https://doi.org/10.3390/geomatics6030043
