To address the challenges posed by the autonomous operation of jujube harvesters in agricultural settings and by the recognition of jujube images with high noise levels and a large number of samples, this paper proposes a multi-module deep learning framework for the accurate recognition of jujubes in complex agricultural environments. The model integrates a Multi-layer Background Smoothing (MBS) module, Joint Cross-Channel Attention (JCCA) and Joint Cross-Spatial Attention (JCSA). Together, these components remove background noise, enhance details and improve the recognition accuracy of jujubes. In experiments across four datasets, the model performed well, particularly in complex environments involving occlusions, background clutter and noise, where it achieved an accuracy (ACC) of 92.23%, a recall of 93.08% and an F1 score of 91.46%. Compared with existing recognition models, the proposed model demonstrates greater stability and adaptability under conditions involving diverse samples, varied backgrounds and high noise levels, and holds significant potential for widespread application in agricultural intelligence. The code is available at: https://github.com/mxz1h/MSJNet/tree/main
Ding et al. (Fri,) studied this question.