May 15, 2026 | Open Access

Per-SAM-MCPA: A Lightweight Framework for Individual Tree Crown Segmentation from UAV Imagery

Key Points

This study aims to develop a lightweight framework for accurate individual tree crown segmentation from UAV imagery, addressing challenges in dense plantation environments.
The proposed framework integrates perceptual target-aware representation, multi-scale detail enhancement, global contextual modeling, and semantic-boundary collaborative refinement.
It uses a Composite Constraint Loss (CCL) that combines a weighted cross-entropy loss, a structural similarity loss, and a Hausdorff-distance boundary term to improve region-level quality and boundary fidelity.
The framework was evaluated on the Catalpa bungei UAV dataset.
It achieved an intersection over union (IoU) of 87.3% and an F1-score of 91.0%.
It outperformed representative baseline methods such as SAM and Mask R-CNN.
It maintained an inference speed of 35.7 FPS on a single GPU.
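
For readers unfamiliar with the two reported metrics, the sketch below shows how pixel-level IoU and F1 are commonly computed from a predicted and a reference binary mask. It is an illustration only: the function name `iou_and_f1` and the toy masks are hypothetical, and the paper may compute its scores per crown instance or average them over images rather than over pixels of a single mask.

```python
def iou_and_f1(pred, truth):
    """Pixel-level IoU and F1 for two binary masks (nested lists of 0/1).

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # crown pixel correctly predicted
            elif p and not t:
                fp += 1          # background predicted as crown
            elif t and not p:
                fn += 1          # crown pixel missed
    union = tp + fp + fn
    iou = tp / union if union else 1.0            # two empty masks agree fully
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return iou, f1


# Toy 3x3 example: the prediction misses one crown pixel.
pred = [[1, 1, 0],
        [1, 0, 0],
        [0, 0, 0]]
truth = [[1, 1, 0],
         [1, 1, 0],
         [0, 0, 0]]
iou, f1 = iou_and_f1(pred, truth)
print(f"IoU={iou:.3f}, F1={f1:.3f}")  # IoU=0.750, F1=0.857
```

Here the prediction has 3 true positives, 0 false positives, and 1 false negative, so IoU is 3/4 and F1 is 6/7.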

Abstract

Accurate individual tree crown (ITC) segmentation from unmanned aerial vehicle (UAV) imagery is important for fine-scale forest inventory, plantation management, and ecological monitoring. However, delineating ITCs in dense plantation environments remains difficult because crowns are strongly adjacent, canopy structures are highly homogeneous, and crown boundaries are often blurred, making it hard for existing methods to preserve both regional integrity and boundary continuity. This study proposes the Perceptual Segment-Anything Model with Multi-head Cross-Parallel Attention (Per-SAM-MCPA), a lightweight and effective framework for fine-grained ITC segmentation in dense plantation scenes. Based on a compact ResNet-50 backbone, the framework integrates perceptual target-aware representation, multi-scale detail enhancement, global contextual modeling, and semantic-boundary collaborative refinement to improve crown discrimination and structural consistency. A perceptual relation module is used to strengthen pixel-level semantic dependency modeling, and a Multi-head Cross-Parallel Attention (MCPA) mechanism is designed to capture long-range contextual interactions through orthogonally decomposed spatial attention, improving global geometric consistency with limited computational overhead. A Composite Constraint Loss (CCL) that combines a weighted cross-entropy loss, a structural similarity loss, and a boundary term based on Hausdorff distance is introduced to jointly optimize region-level segmentation quality and boundary fidelity. Experiments on the Catalpa bungei UAV dataset show that the proposed method achieves an intersection over union (IoU) of 87.3% and an F1-score of 91.0%, outperforming representative baseline methods such as SAM and Mask R-CNN while maintaining an inference speed of 35.7 FPS on a single GPU. These results indicate that Per-SAM-MCPA offers an accurate, efficient, and practical solution for ITC segmentation in dense plantation environments.
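
The abstract names the three terms of the Composite Constraint Loss but gives no formulas, so the following is only a rough sketch of how such a combination might be evaluated on a single binary mask. Everything specific here is an assumption: the function names, the class weights `w_pos` and `w_neg`, the term weights `alpha`, `beta`, `gamma`, the single-window (global) SSIM, and the exact Hausdorff distance on thresholded boundaries. A training loss would need a differentiable surrogate for the boundary term, which this plain-Python version does not provide.

```python
import math


def weighted_bce(prob, truth, w_pos=2.0, w_neg=1.0, eps=1e-7):
    """Mean weighted binary cross-entropy; w_pos and w_neg are assumed values."""
    total, n = 0.0, 0
    for prob_row, truth_row in zip(prob, truth):
        for p, t in zip(prob_row, truth_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= w_pos * t * math.log(p) + w_neg * (1 - t) * math.log(1.0 - p)
            n += 1
    return total / n


def global_ssim(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM over the whole image as one window (values assumed in [0, 1])."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))


def boundary_pixels(mask):
    """Foreground pixels with a 4-neighbour that is background or off-image."""
    h, w = len(mask), len(mask[0])
    points = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c]:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(not (0 <= rr < h and 0 <= cc < w) or not mask[rr][cc]
                   for rr, cc in neighbours):
                points.append((r, c))
    return points


def hausdorff(pts_a, pts_b):
    """Symmetric Hausdorff distance between two point sets (brute force)."""
    if not pts_a and not pts_b:
        return 0.0
    if not pts_a or not pts_b:
        return float("inf")  # one boundary is missing entirely

    def directed(src, dst):
        return max(min(math.dist(s, d) for d in dst) for s in src)

    return max(directed(pts_a, pts_b), directed(pts_b, pts_a))


def composite_loss(prob, truth, alpha=1.0, beta=1.0, gamma=0.1):
    """Weighted sum of the three terms; alpha, beta, gamma are assumed weights."""
    pred_mask = [[1 if p >= 0.5 else 0 for p in row] for row in prob]
    region_term = weighted_bce(prob, truth)
    structure_term = 1.0 - global_ssim(prob, truth)
    boundary_term = hausdorff(boundary_pixels(pred_mask), boundary_pixels(truth))
    return alpha * region_term + beta * structure_term + gamma * boundary_term
```

With these definitions, a prediction identical to the reference mask gives a loss close to zero, and a crown shifted by one pixel is penalised by all three terms, which matches the stated intent of optimising region quality and boundary fidelity jointly.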
