May 26, 2026 | Open Access

Foundation Model‐Assisted Segmentation Enables Robust Field‐Based Severity Estimation: A Case‐Study of Coffee Leaf Rust

Key Points

The study aims to improve the accuracy of coffee leaf rust severity estimation using advanced segmentation methods.
The authors developed a segmentation pipeline that uses a zero-shot SAM3 model and a fine-tuned SAM2 model to estimate percentage severity.
They collected 1285 field images and curated 606 pixel-level lesion masks for model fine-tuning and independent evaluation.
A two-stage detection-segmentation workflow integrating YOLOv8 and SAM2 extracted individual leaves from complex branch-level images.
SAM3 achieved the highest lesion segmentation performance (Dice ≈0.91; IoU ≈0.83).
DeepLabV3+ also performed well (Dice ≈0.86; IoU ≈0.75).
Classical threshold-based methods (ImageJ and pliman) had high recall but substantially lower precision, which led to overestimated disease severity.

Abstract

Accurate estimation of coffee leaf rust (Hemileia vastatrix) severity remains challenging due to symptom heterogeneity, illumination variability and observer subjectivity, particularly under field conditions. We developed a segmentation pipeline for estimating percentage severity that uses a zero-shot foundation model (SAM3) and a fine-tuned deep learning model (SAM2), and compared both with three other segmentation approaches: DeepLabV3+ (a convolutional neural network model), ImageJ (colour thresholding) and the pliman R package (palette-based). A total of 1285 images were collected across several field plots, and 606 pixel-level rust lesion masks were curated for model fine-tuning and independent evaluation. A two-stage detection-segmentation workflow integrating YOLOv8 and SAM2 enabled robust leaf extraction from complex branch-level images, with detection performance consistently exceeding 0.98 across metrics. For lesion segmentation at the pixel level, the zero-shot foundation model SAM3 achieved the highest performance (Dice ≈0.91; IoU ≈0.83), with balanced precision (≈0.93) and recall (≈0.94). DeepLabV3+ also performed well (Dice ≈0.86; IoU ≈0.75), whereas the classical threshold-based approaches showed high recall but substantially lower precision, indicating frequent false positives that produced a systematic positive bias in severity. SAM2 was more conservative, yielding higher precision but lower recall than SAM3. At the leaf level, agreement between predicted and reference severity estimates was highest for SAM3, followed by DeepLabV3+ and SAM2, while the classical methods overestimated diseased area. Overall, SAM3 provided reliable, scalable and consistent severity estimation under real-world field conditions.
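The pixel-level metrics and the percentage severity reported in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions of precision, recall, Dice and IoU on binary masks, and assumes severity is lesion area as a percentage of leaf area; the abstract does not spell out the authors' exact formulas. The function names and the 4×4 toy masks are hypothetical and are not taken from the study.

```python
def confusion_counts(pred, ref):
    """Count pixel-level TP, FP and FN for two binary masks (nested lists of 0/1)."""
    tp = fp = fn = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                tp += 1
            elif p and not r:
                fp += 1
            elif r and not p:
                fn += 1
    return tp, fp, fn


def segmentation_metrics(pred, ref):
    """Standard overlap metrics; assumed here, not confirmed as the authors' exact code."""
    tp, fp, fn = confusion_counts(pred, ref)
    union = tp + fp + fn
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "dice": 2 * tp / (2 * tp + fp + fn) if union else 1.0,
        "iou": tp / union if union else 1.0,
    }


def percent_severity(lesion_mask, leaf_mask):
    """Lesion pixels inside the leaf as a percentage of leaf pixels (assumed definition)."""
    leaf_pixels = lesion_pixels = 0
    for lesion_row, leaf_row in zip(lesion_mask, leaf_mask):
        for lesion_px, leaf_px in zip(lesion_row, leaf_row):
            if leaf_px:
                leaf_pixels += 1
                if lesion_px:
                    lesion_pixels += 1
    return 100.0 * lesion_pixels / leaf_pixels if leaf_pixels else 0.0


# Hypothetical toy data: a 4x4 leaf, a 4-pixel reference lesion, and a
# prediction that covers the whole lesion plus 2 extra (false positive) pixels.
leaf_mask = [[1, 1, 1, 1] for _ in range(4)]
reference = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
predicted = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

metrics = segmentation_metrics(predicted, reference)
reference_severity = percent_severity(reference, leaf_mask)
predicted_severity = percent_severity(predicted, leaf_mask)
print(metrics)
print(reference_severity, predicted_severity)
```

In this toy case the prediction has perfect recall but reduced precision, and its severity estimate is higher than the reference. That is the same pattern of positive bias the abstract attributes to the threshold-based methods.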
