Early and accurate detection of plant diseases is essential for ensuring food security and maintaining sustainable agricultural productivity. However, most deep learning models for plant disease classification rely heavily on large-scale annotated datasets, which are expensive, labor-intensive, and often impractical to obtain in real-world farming environments. To address this limitation, we propose a unified self-supervised learning (SSL) framework that leverages unlabeled plant imagery to learn meaningful and transferable visual representations. Our method integrates three complementary objectives—Bootstrap Your Own Latent (BYOL), Masked Image Modeling (MIM), and contrastive learning—within a ResNet101 backbone, optimized through a hybrid loss function that captures global alignment, local structure, and instance-level distinction. GPU-based data augmentations are used to introduce stochasticity and enhance generalization during pretraining. Experimental results on the challenging PlantDoc dataset demonstrate that our model achieves an accuracy of 77.82%, with macro-averaged precision, recall, and F1-score of 80.00%, 78.24%, and 77.48%, respectively—on par with or exceeding most state-of-the-art supervised and self-supervised approaches. Furthermore, when fine-tuned on the PlantVillage dataset, the pretrained model attains 99.85% accuracy, highlighting its strong cross-domain generalization and practical transferability. These findings underscore the potential of self-supervised learning as a scalable, annotation-efficient, and robust solution for plant disease detection in real-world agricultural settings, especially where labeled data is scarce or unavailable.
Huan et al. (Wed,) studied this question.