March 3, 2026 | Open Access

Transformer-Based Multi-Class Classification of Bangladeshi Rice Varieties Using Image Data

Key Points

ViT achieved 99.86% accuracy, with precision, recall, and F1-score all at 0.9986, the higher of the two models evaluated.
Swin Transformer reached 99.44% accuracy, with a precision of 0.9944, recall of 0.9944, and F1-score of 0.9943.
Both models were evaluated on the publicly available PRBD dataset of Bangladeshi rice varieties, and the results indicate that transformer models can classify rice varieties with high accuracy.
The findings highlight the need for further exploration of transformer models in agricultural applications, particularly for rice variety identification.

Abstract

Rice (Oryza sativa L.) is a staple food for over half of the global population, with significant economic, agricultural, and cultural importance, particularly in Asia. Thousands of rice varieties exist worldwide, differing in size, shape, color, and texture, making accurate classification essential for quality control, breeding programs, and authenticity verification in trade and research. Traditional manual identification of rice varieties is time-consuming, error-prone, and heavily reliant on expert knowledge. Deep learning provides an efficient alternative by automatically extracting discriminative features from rice grain images for precise classification. While prior studies have primarily employed convolutional models such as custom CNNs, VGG, InceptionV3, MobileNet, and DenseNet201, transformer-based models remain underexplored for rice variety classification. This study addresses this gap by applying two transformer-based models, Swin Transformer and Vision Transformer (ViT), to multi-class classification of rice varieties using the publicly available PRBD dataset from Bangladesh. Experimental results show that the ViT model achieved an accuracy of 99.86% with precision, recall, and F1-score all at 0.9986, while the Swin Transformer model obtained an accuracy of 99.44% with a precision of 0.9944, recall of 0.9944, and F1-score of 0.9943. These results highlight the effectiveness of transformer-based models for high-accuracy rice variety classification.
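
The four metrics reported in the abstract are standard for multi-class classification. As a minimal sketch, the Python snippet below shows one common way to compute them from a confusion matrix. The support-weighted averaging is an assumption, since the abstract does not state which averaging scheme was used. The function name and the toy confusion matrix are hypothetical and are not data from the study.

```python
from typing import Dict, List


def classification_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and support-weighted precision, recall, and F1.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Weighted averaging is an assumption here; the
    source does not say which averaging scheme it used.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    precision_w = recall_w = f1_w = 0.0
    for k in range(n):
        tp = cm[k][k]
        support = sum(cm[k])                          # true samples of class k
        predicted = sum(cm[i][k] for i in range(n))   # samples predicted as k
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total                      # class share of the data
        precision_w += weight * p
        recall_w += weight * r
        f1_w += weight * f1

    return {
        "accuracy": correct / total,
        "precision": precision_w,
        "recall": recall_w,
        "f1": f1_w,
    }


# Toy 3-class confusion matrix, invented for illustration only.
toy_cm = [
    [48, 2, 0],
    [1, 49, 0],
    [0, 1, 49],
]
metrics = classification_metrics(toy_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under support-weighted averaging, recall equals accuracy by construction, because each class's recall is weighted by that class's share of the samples. Precision and F1-score can still differ slightly from accuracy.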

Cite This Study

Tabassum et al. studied this question.

synapsesocial.com/papers/69a75a97c6e9836116a209bf https://doi.org/10.3390/app16031279
