Accurate classification of urban tree species is fundamental to urban green space management and ecological assessment. To address the difficulty of detecting small and overlapping tree crowns in high-resolution remote sensing imagery, this study proposes YOLO-CNGD, a framework built on YOLOv11n. It introduces four enhancements: the Convolutional Block Attention Module (CBAM) to refine feature representation, the Normalized Wasserstein Distance (NWD) loss for robust small-object localization, Deformable Convolution v3 (DCNv3) to adapt to irregular crown shapes, and GhostConv in place of standard convolutions to keep the model lightweight. Experiments on a self-built urban tree dataset show that YOLO-CNGD achieves a precision of 94.8%, a recall of 91.1%, and an mAP@0.5 of 93.7%. The model balances accuracy and efficiency, making it a promising candidate for large-scale automated urban tree inventory.
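To make the NWD term concrete, the following is a minimal Python sketch of the commonly published form of the Normalized Wasserstein Distance, not the implementation used in this study. It assumes boxes are given as (cx, cy, w, h) and each box is modeled as a 2D Gaussian; the function names `nwd` and `nwd_loss` and the constant `c = 12.8` are illustrative assumptions, and in practice `c` is a dataset-dependent normalization constant.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes.

    Assumption: c is a dataset-dependent constant; 12.8 is only a placeholder.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Each box is modeled as a 2D Gaussian with mean (cx, cy) and
    # covariance diag(w^2 / 4, h^2 / 4). The squared 2-Wasserstein
    # distance between two such Gaussians has this closed form.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Exponential normalization maps the distance into (0, 1].
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Loss form: 0 for identical boxes, approaching 1 as boxes diverge."""
    return 1.0 - nwd(pred_box, target_box, c)


# Two 4x4 boxes whose centers are 6 pixels apart do not overlap (IoU = 0),
# yet NWD still gives a graded similarity instead of a flat zero.
similarity = nwd((10, 10, 4, 4), (16, 10, 4, 4))
```

The design point is that for very small crowns a shift of a few pixels can drive IoU to zero, whereas the NWD value changes smoothly with center offset and size difference, so the localization loss stays informative.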
Zhang et al. studied this question.