The combination of artificial intelligence (AI) and computer vision is transforming agriculture by providing information that helps optimize crop and resource management and thus boost productivity. AI enables early warning of plant stress caused by environmental factors, which helps reduce yield losses. Thermal imaging records temperature differences in vegetation and reveals physiological stresses, allowing agronomists to intervene before serious damage occurs. This study examines deep learning algorithms for detecting plant stress from thermal images of paddy leaf blades. Although convolutional neural networks (CNNs) have long dominated image classification, Vision Transformers (ViTs) are proving to be an effective alternative. The study presents a comparative analysis of several CNN architectures (VGG-16, VGG-19, ResNet-50, and Xception) and the Global Context Vision Transformer (GCViT). The findings show that ResNet and GCViT outperform the other models, reaching accuracies of 0.93 to 0.97 with a variation of around 3% across dataset folds and classes. In addition, explainable AI is applied to the model to identify the regions of the image that determine the predicted stress class.
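As an illustration of how accuracy and its variation across dataset folds might be summarized, the following is a minimal sketch and not the authors' evaluation code. The function names, the class labels ("stressed", "healthy"), and the toy data are all hypothetical; the study's actual classes and fold count are not stated here.

```python
from statistics import mean


def fold_accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels in one fold."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("labels and predictions must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def summarize_folds(folds):
    """Return per-fold accuracies, their mean, and the max-minus-min spread."""
    accs = [fold_accuracy(t, p) for t, p in folds]
    return {"per_fold": accs, "mean": mean(accs), "spread": max(accs) - min(accs)}


# Hypothetical toy data: one (true labels, predicted labels) pair per fold.
toy_folds = [
    (["stressed", "healthy", "healthy", "stressed"],
     ["stressed", "healthy", "healthy", "stressed"]),
    (["stressed", "healthy", "healthy", "stressed"],
     ["stressed", "healthy", "stressed", "stressed"]),
]

summary = summarize_folds(toy_folds)
print(summary)
```

Here the spread is taken as the gap between the best and worst fold, which is one plausible reading of "variation across folds"; a standard deviation over folds would be an equally reasonable choice.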
Patil et al. (Sat,) studied this question.