Old-vine vineyards often contain dozens of grapevine varieties that are intermingled and irregularly distributed, which makes plant-level varietal identification slow and expensive when it relies on ampelography or molecular approaches. This paper proposes a field-oriented computer-vision pipeline for Vitis vinifera variety identification using natural-background images from the historic “Vinha Maria Teresa” parcel (Quinta do Crasto, Portugal). A single-class YOLO11 detector is trained to localize the vine leaf and generate standardized crops, and a YOLO11 classifier is then fine-tuned on leaf regions of interest (ROIs) for eight selected varieties in the Douro UNESCO region. We annotated 2015 vineyard images for classification and supplemented detection training with 2648 additional leaf images; detectors (YOLO11n/s/m) were benchmarked under four augmentation regimes and evaluated on a fixed 48-image subset, including runtime on CPU and GPU. The best detector reached an mAP@50–95 of 0.918 on the benchmark, while YOLO11n achieved ∼27 FPS on CPU for fast cropping. On a 303-image test set, the best classifier (YOLO11s with mixed augmentations) achieved 94.06% Top-1 accuracy, 93.92% macro-F1, and 100% Top-5 accuracy, with the remaining errors concentrated among morphologically similar varieties. To assess deployment-oriented performance, classifiers trained under three input settings (manual crops, detector-generated crops, and full images) were evaluated on a held-out 48-image benchmark subset. Removing the detection step reduced Top-1 accuracy from 75.00% to 68.75%, while the gap between manual and automatic crops was only 2.44 pp on successfully detected images; detection failures (14.6% of benchmark images) were the primary operational bottleneck. Repeated retraining of the best manual-crop YOLO11s configuration across multiple random seeds showed stable performance, with low variability in Top-1 accuracy and macro-F1. 
Under identical training conditions, ResNet50 and EfficientNet-B0 provided competitive baselines, but YOLO11s remained the strongest overall model on the held-out field benchmark. These results indicate that lightweight leaf detection plus crop-based classification can support scalable varietal identification in old vineyards under realistic acquisition conditions.
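The classification metrics reported above (Top-1 accuracy, Top-5 accuracy, and macro-F1) can be illustrated with a minimal pure-Python sketch. It is not the evaluation code used in this work; the function names, the toy labels "A", "B", "C", and the ranked predictions are hypothetical and serve only to make the definitions concrete. Macro-F1 is taken here as the unweighted mean of per-class F1 scores, which is the usual definition but is an assumption about how the reported value was computed.

```python
from collections import Counter


def top_k_accuracy(y_true, ranked_preds, k=1):
    """Fraction of samples whose true label appears in the first k ranked predictions."""
    hits = sum(1 for t, ranked in zip(y_true, ranked_preds) if t in ranked[:k])
    return hits / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in labels or predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: three classes, six samples, predictions ranked by confidence.
y_true = ["A", "A", "B", "B", "C", "C"]
ranked = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["B", "C", "A"],
    ["B", "A", "C"],
    ["C", "A", "B"],
    ["A", "C", "B"],
]
y_pred = [r[0] for r in ranked]  # Top-1 prediction is the first ranked label

print("Top-1:", top_k_accuracy(y_true, ranked, k=1))
print("Top-2:", top_k_accuracy(y_true, ranked, k=2))
print("Macro-F1:", macro_f1(y_true, y_pred))
```

On a 48-image benchmark each image contributes about 2.08 pp to Top-1 accuracy, so small differences between configurations correspond to only a few images. If all 48 images are counted in the denominator (an inference from the reported percentages, not something stated above), 75.00% and 68.75% correspond to 36 and 33 correctly classified images, respectively.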
Ferreira et al. studied this question.