What question did this study set out to answer?

This study examines how tree species composition, stand density, and foliage condition affect single-tree segmentation accuracy using deep learning.

January 17, 2026 | Open Access

Single-tree Delineation by Instance Segmentation Using Drone-based Lidar and Multispectral Imagery: a Comparative Study in Various Forest Structures

Key Points

This study examines how tree species composition, stand density, and foliage condition affect single-tree segmentation accuracy using deep learning.
The assessment used UAV multispectral imagery and lidar canopy height models (CHM).
High-resolution lidar and multispectral data were collected over several hectares of forest in three study areas.
Different deep learning models were trained and validated on multispectral images (RGB, CIR), CHM-derived images, and images fused from the CHM and two near-infrared channels (RE, NIR).
Drone flights at 80 m altitude with at least 50% lateral overlap yielded an average point density of 150 points/m².
With the best-performing CHM channel combination, the average F1 score across the three study areas was 70% (range: 36–100%).
In the Bavarian Forest, the highest F1 score was 82%, surpassing baseline methods by up to 39%.
In the Black Forest, the highest F1 score was 85%, but scores were more than 50% lower in complex deciduous plots.
Accuracy in coniferous areas reached 81%, about 20% higher than in deciduous stands.
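
The F1 scores quoted in these key points combine precision and recall into a single number. The sketch below shows the standard F1 calculation from matched and unmatched tree crowns. The `MatchCounts` container and the example counts are illustrative assumptions, not data or code from the study, and the rule for deciding when a predicted crown matches a reference tree is left outside the sketch because the abstract does not specify it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchCounts:
    """Detection counts for one plot (hypothetical container, not from the study)."""
    true_positives: int   # predicted crowns matched to a reference tree
    false_positives: int  # predicted crowns without a reference tree
    false_negatives: int  # reference trees without a predicted crown


def precision(c: MatchCounts) -> float:
    """Share of predicted crowns that correspond to a reference tree."""
    detected = c.true_positives + c.false_positives
    return c.true_positives / detected if detected else 0.0


def recall(c: MatchCounts) -> float:
    """Share of reference trees that were found."""
    reference = c.true_positives + c.false_negatives
    return c.true_positives / reference if reference else 0.0


def f1_score(c: MatchCounts) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Illustrative counts only: 50 reference trees, 50 predicted crowns, 41 matched.
plot = MatchCounts(true_positives=41, false_positives=9, false_negatives=9)
print(f"F1 = {f1_score(plot):.2f}")  # prints "F1 = 0.82"
```

Because F1 is a harmonic mean, a plot with many missed trees or many spurious crowns scores low even if the other quantity is high, which is one reason per-plot scores can spread as widely as the 36–100% range reported here.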

Abstract

Deep learning methods such as Mask R‑CNN enable the precise delineation of single-tree crowns from remote sensing data. However, their segmentation performance still depends on local stand conditions. Using UAV multispectral imagery and lidar canopy height models (CHM), we assessed the influence of tree species composition, stand density, and foliage condition on the robustness of deep-learning-based single-tree segmentation. High-resolution laser data and multispectral data were collected over several hectares of forest area (Bavarian Forest National Park; DBU Natural Heritage, Schönau Foundation; Black Forest National Park, Kinzigtal) using a DJI 600 Pro drone. The Fraunhofer Lightweight Airborne Profiler collected a multispectral point cloud using a 905-nm laser and two integrated RGB cameras with 4112 × 3008 pixels. Another multispectral camera captured RGB imagery with 4112 × 3008 pixels and two monochrome bands (725 nm RE, 850 nm NIR; 2164 × 2056 pixels each). Flights were conducted at 80 m altitude with ≥ 50% lateral overlap, resulting in an average point density of 150 points/m². Different models were trained and validated using multispectral images (RGB, CIR), images derived from the CHM, and images fused from the CHM and two near-infrared channels (RE, NIR). Highly accurate tree positions and manually processed tree segments were available for accuracy analyses. When the best-performing CHM channel combination was used, the average F1 score across the three study areas was 70% (range: 36–100%). In the Bavarian Forest, the highest F1 score was 82%, surpassing that obtained with baseline methods by up to 39%. In the Black Forest, the highest F1 score was 85%, but it was > 50% lower in complex deciduous plots. The gains in the DBU Schönau Foundation area were smaller, with F1 scores up to 90%, about 20% above baseline.
Multispectral channel combinations, such as CIR or CHM with two IR bands (725 nm, 850 nm), contributed only marginally to tree segmentation. The accuracy in coniferous areas reached 81%, which was about 20% higher than in deciduous stands, although this was influenced by the high stem densities (1000 stems per hectare). In a single reference plot, the results improved substantially under leaf-off conditions. Polygons from delineated tree crowns were of significantly better quality than those from baseline methods. Overall, this study demonstrated the superiority of deep-learning-based tree segmentation in complex, dense forest structures.
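
The abstract mentions images fused from the CHM and the two near-infrared channels (RE, NIR). One plausible way to build such an input is to rescale each band and stack the three into a single three-channel image, sketched below in plain Python. The min-max scaling, the function names, and the toy 2 × 2 rasters are assumptions made for illustration; the study's actual preprocessing is not described in the abstract. The sketch also assumes the bands have already been co-registered and resampled to a common grid.

```python
from typing import List, Sequence, Tuple

Band = Sequence[Sequence[float]]   # one raster band as rows of values
Pixel = Tuple[int, int, int]       # (CHM, RE, NIR) after scaling


def scale_to_uint8(band: Band) -> List[List[int]]:
    """Min-max scale a band to 0..255 (assumed normalization, not from the study)."""
    values = [v for row in band for v in row]
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant band carries no contrast; map it to zeros.
        return [[0 for _ in row] for row in band]
    return [[round(255 * (v - lo) / span) for v in row] for row in band]


def fuse_channels(chm: Band, red_edge: Band, nir: Band) -> List[List[Pixel]]:
    """Stack CHM, RE, and NIR into one three-channel image on a shared grid."""
    bands = [scale_to_uint8(b) for b in (chm, red_edge, nir)]
    shapes = {(len(b), len(b[0])) for b in bands}
    if len(shapes) != 1:
        raise ValueError("all bands must be resampled to the same grid first")
    # Pair up corresponding rows, then corresponding pixels within each row.
    return [list(zip(*rows)) for rows in zip(*bands)]


# Toy 2 x 2 rasters (made-up values): heights in metres, RE/NIR as raw counts.
chm = [[0.0, 8.0], [16.0, 40.0]]
red_edge = [[100, 200], [300, 600]]
nir = [[200, 1000], [0, 400]]

fused = fuse_channels(chm, red_edge, nir)
print(fused[0][1])  # prints "(51, 51, 255)"
```

Placing height in one channel and reflectance in the other two lets a model built for three-channel images read structure and spectral information together, although the abstract reports that the additional spectral channels contributed only marginally to segmentation.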
