What question did this study set out to answer?

The aim is to develop a computer-vision pipeline for rapid varietal identification of grapevines in complex vineyard environments.

April 22, 2026 · Open Access

vinum-Analytics

Key Points

The aim is to develop a computer-vision pipeline for rapid varietal identification of grapevines in complex vineyard environments.
A single-class YOLO11 detector localizes the vine leaf and generates standardized image crops.
A YOLO11 classifier is fine-tuned on leaf regions of interest for eight grapevine varieties, using annotated vineyard images.
Classifiers trained under three input settings (manual crops, detector-generated crops, and full images) are compared to assess deployment-oriented performance.
The best classifier (YOLO11s with mixed augmentations) achieved 94.06% Top-1 accuracy on a 303-image test set.
Detection failures affected 14.6% of the held-out benchmark images and were the primary operational bottleneck.
Retraining the best manual-crop YOLO11s configuration across multiple random seeds gave stable Top-1 accuracy and macro-F1 with low variability.
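The detect-then-classify flow summarized in these points can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `Detection` type, the `square_crop_box` and `identify_variety` helpers, the 10% padding, the square crop shape, and the 0.25 confidence threshold are all assumptions made here, and the detector and classifier are passed in as plain callables so the sketch runs without trained weights.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


@dataclass
class Detection:
    box: Box
    score: float  # detector confidence in [0, 1]


def square_crop_box(box: Box, img_w: int, img_h: int, pad: float = 0.10) -> Box:
    """Pad a leaf box, square it around its centre, and clamp it to the image.

    The padding ratio and the square shape are assumptions; the source only
    states that the crops are "standardized".
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    half = max(x2 - x1, y2 - y1) * (1 + 2 * pad) / 2
    return (
        max(0, int(round(cx - half))),
        max(0, int(round(cy - half))),
        min(img_w, int(round(cx + half))),
        min(img_h, int(round(cy + half))),
    )


def identify_variety(
    image,
    img_w: int,
    img_h: int,
    detect: Callable[[object], List[Detection]],
    classify: Callable[[object, Box], str],
    min_score: float = 0.25,  # hypothetical threshold
) -> Optional[str]:
    """Two-stage inference: detect the leaf, build the crop box, classify it.

    Returns None when no detection passes the threshold, which is the
    "detection failure" case that yields no varietal prediction.
    """
    candidates = [d for d in detect(image) if d.score >= min_score]
    if not candidates:
        return None
    best = max(candidates, key=lambda d: d.score)
    return classify(image, square_crop_box(best.box, img_w, img_h))


# Stand-in models (hypothetical) so the sketch is runnable end to end.
def fake_detect(image) -> List[Detection]:
    return [Detection((100, 50, 300, 150), 0.9)]


def fake_classify(image, crop_box: Box) -> str:
    return "variety_A"


print(square_crop_box((100, 50, 300, 150), 640, 480))          # (80, 0, 320, 220)
print(identify_variety(None, 640, 480, fake_detect, fake_classify))  # variety_A
```

Returning `None` instead of falling back to the full image keeps detection failures visible as a separate error source, which is how the key points report them.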

Abstract

Old-vine vineyards often contain dozens of grapevine varieties intermingled and irregularly distributed, making plant-level varietal identification slow and expensive when based on ampelography or molecular approaches. This paper proposes a field-oriented computer-vision pipeline for Vitis vinifera variety identification using natural-background images from the historic “Vinha Maria Teresa” parcel (Quinta do Crasto, Portugal). A single-class YOLO11 detector is trained to localize the vine leaf and generate standardized crops, and a YOLO11 classifier is then fine-tuned on leaf regions of interest (ROIs) for eight selected varieties from the Douro UNESCO region. We annotated 2015 vineyard images for classification and supplemented detection training with 2648 additional leaf images. Detectors (YOLO11n/s/m) were benchmarked under four augmentation regimes and evaluated on a fixed 48-image subset, including runtime on CPU and GPU. The best detector reached mAP@50–95 of 0.918 on the benchmark, while YOLO11n achieved ∼27 FPS on CPU for fast cropping. On a 303-image test set, the best classifier (YOLO11s with mixed augmentations) achieved 94.06% Top-1 accuracy, 93.92% macro-F1, and 100% Top-5 accuracy, with the remaining errors concentrated among morphologically similar varieties. To assess deployment-oriented performance, classifiers trained under three input settings (manual crops, detector-generated crops, and full images) were evaluated on a held-out 48-image benchmark subset. Removing the detection step reduced Top-1 accuracy from 75.00% to 68.75%, while the gap between manual and automatic crops was only 2.44 pp on successfully detected images; detection failures (14.6%) were the primary operational bottleneck. Repeated retraining of the best manual-crop YOLO11s configuration across multiple random seeds showed stable performance, with low variability in Top-1 accuracy and macro-F1. Under identical training conditions, ResNet50 and EfficientNet-B0 provided competitive baselines, but YOLO11s remained the strongest overall model on the held-out field benchmark. These results indicate that lightweight leaf detection plus crop-based classification can support scalable varietal identification in old vineyards under realistic acquisition conditions.
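Because the benchmark subsets are small, it can help to translate the reported percentages back into image counts. The sketch below does this by brute force. The counts it prints are back-calculated from the rounded percentages and are not stated in the source, so they should be read as values consistent with the abstract rather than as reported results. The helper name `implied_counts` is introduced here for illustration, as is the reading that the 14.6% failure rate refers to the 48-image benchmark.

```python
from typing import List


def implied_counts(percent: float, total: int, decimals: int = 2) -> List[int]:
    """Return every integer count k in [0, total] whose share k/total,
    rounded to `decimals` places, equals the reported percentage."""
    return [
        k
        for k in range(total + 1)
        if abs(round(100 * k / total, decimals) - percent) < 1e-9
    ]


# 303-image test set: 94.06% Top-1 accuracy.
print(implied_counts(94.06, 303))            # [285]

# 48-image benchmark: Top-1 with and without the detection step.
print(implied_counts(75.00, 48))             # [36]
print(implied_counts(68.75, 48))             # [33]

# Detection failures, reported to one decimal place (assumed to be out of 48).
failures = implied_counts(14.6, 48, decimals=1)
print(failures)                              # [7]

# Manual vs. automatic crop gap on the successfully detected images.
detected = 48 - failures[0]
print(detected, implied_counts(2.44, detected))  # 41 [1]
```

Under these assumptions, the 2.44 pp gap between manual and automatic crops corresponds to a single image out of 41, while the missed detections account for seven images. This is consistent with the abstract's conclusion that detection failures, not crop quality, are the main bottleneck.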
