A single non-destructive testing technology cannot fully capture the quality characteristics of mangoes. To accurately predict the soluble solids content (SSC), a key quality indicator of mangoes, this study proposes a multimodal detection method integrating hyperspectral imaging, electronic nose, and computer vision technology. A deep learning-based GAT-Net model was developed, which extracts local spatiotemporal features through convolutional layers, captures the global correlation of multi-source data using an attention mechanism, and performs adaptive weighted integration of cross-modal information through gated fusion layers. Experimental results demonstrate that the model's prediction performance improves progressively as data modalities are added. With the attention and gating mechanisms incorporated, the model achieved its best performance, with R² reaching 0.9798 and RPD increasing to 11.5586. In summary, the GAT-Net framework proposed in this study effectively integrates multimodal data to predict the soluble solids content of mangoes, offering a novel approach for non-destructive testing of post-harvest fruit quality.

• A GAT-Net model was developed to predict the soluble solids content of mango.
• Spectral, electronic nose, and image data can be effectively fused.
• Model performance improves gradually as data modalities are added.
• Attention and gating mechanisms enable the model to achieve the best results.
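The adaptive weighted integration attributed to the gated fusion layers can be illustrated with a minimal sketch. It is not the GAT-Net implementation: the softmax-normalized scalar gate per modality, the function name `gated_fusion`, the parameter names, and the toy feature values are all assumptions made for illustration, and the source does not specify the actual gate formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(features, gate_weights, gate_biases):
    """Fuse per-modality feature vectors with one scalar gate per modality.

    Assumed (hypothetical) formulation: each modality gets a score
    w . x + b, the scores are softmax-normalized into gates that sum
    to 1, and the fused vector is the gate-weighted sum of the features.
    All feature vectors must have the same length.
    """
    names = list(features)
    scores = [
        sum(w * x for w, x in zip(gate_weights[n], features[n])) + gate_biases[n]
        for n in names
    ]
    gates = softmax(scores)
    dim = len(features[names[0]])
    fused = [
        sum(g * features[n][i] for g, n in zip(gates, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, gates))


# Toy embeddings standing in for the three modalities (made-up values).
features = {
    "hyperspectral": [0.8, 0.1, 0.3],
    "enose": [0.2, 0.9, 0.4],
    "image": [0.5, 0.5, 0.5],
}
# Hypothetical gate parameters; in a trained model these would be learned.
gate_weights = {name: [0.0, 0.0, 0.0] for name in features}
gate_biases = {"hyperspectral": 1.0, "enose": 0.0, "image": 0.0}

fused, gates = gated_fusion(features, gate_weights, gate_biases)
print(gates)
print(fused)
```

In this sketch the larger bias on the hyperspectral branch gives it the largest gate, so the fused vector leans toward the spectral features. With all gate parameters set to zero, the fusion reduces to a plain average of the three modalities.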
Huang et al. studied this question.