Accurate, interpretable classification of soybean seed types is needed to support breeding initiatives and agricultural decision-making. This paper proposes a Hybrid Attention-based Explainable CNN framework that combines handcrafted multi-trait descriptors with deep CNN features to improve both classification performance and transparency. Four model configurations (MobileNetV3, ConvNeXt, hybrid feature concatenation without attention, and hybrid attention with CBAM) were evaluated under common criteria on five soybean seed types. The baseline CNN models reached 90–94% accuracy but struggled with visually similar classes. Multi-trait feature integration raised accuracy to 95%, and the attention-based hybrid model performed best, reaching 98% with greater class separability. Grad-CAM and attention visualizations confirmed that the model focuses on physiologically relevant traits, demonstrating the efficacy of combining deep learning, handcrafted features, and attention for robust and interpretable seed variety classification.
Adugna et al. studied this question.
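To make the hybrid attention configuration more concrete, the following is a minimal NumPy sketch of one way such a fusion could be wired: a CBAM-style channel attention gate is applied to a CNN feature map, the result is global-average-pooled, and the pooled vector is concatenated with a handcrafted trait vector. It is an illustration under stated assumptions, not the authors' implementation. The function names (`channel_attention`, `fuse`), the tensor sizes, the reduction ratio, the random weights, and the choice to apply attention before concatenation are all hypothetical; CBAM's spatial attention branch is omitted for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(fmap, w1, w2):
    """CBAM-style channel attention for one feature map of shape (C, H, W).

    w1 (C/r, C) and w2 (C, C/r) form the shared two-layer MLP.
    Returns the re-weighted feature map and the per-channel gate.
    """
    avg = fmap.mean(axis=(1, 2))  # average-pooled channel descriptor, (C,)
    mx = fmap.max(axis=(1, 2))    # max-pooled channel descriptor, (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden layer

    gate = sigmoid(mlp(avg) + mlp(mx))       # per-channel weights in (0, 1)
    return fmap * gate[:, None, None], gate

def fuse(fmap, handcrafted, w1, w2):
    """Attend over CNN channels, pool, then append handcrafted descriptors."""
    attended, _ = channel_attention(fmap, w1, w2)
    deep_vec = attended.mean(axis=(1, 2))    # global average pooling, (C,)
    return np.concatenate([deep_vec, handcrafted])

# Hypothetical sizes: 16 channels, 7x7 map, reduction ratio 4, 6 handcrafted traits.
rng = np.random.default_rng(0)
C, H, W, R, T = 16, 7, 7, 4, 6
fmap = rng.standard_normal((C, H, W))        # stand-in for backbone output
handcrafted = rng.standard_normal(T)         # stand-in for multi-trait descriptors
w1 = rng.standard_normal((C // R, C)) * 0.1  # untrained placeholder weights
w2 = rng.standard_normal((C, C // R)) * 0.1

fused = fuse(fmap, handcrafted, w1, w2)
print(fused.shape)  # (22,)
```

In a trained model the fused vector would feed a classifier head over the five seed types, and the MLP weights would be learned jointly with the backbone rather than drawn at random.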