Near-infrared (NIR) spectroscopy is a widely used technology in the horticulture industry for non-destructive fruit grading. Partial Least Squares (PLS) regression is the dominant method for producing fruit quality predictions from measured spectra. Alternative deep learning methods have shown promise, but often require large amounts of labelled data to train. This study proposes a semi-supervised method based on Barlow Twins to include unlabelled data in the training process. We adapt the Barlow Twins method by treating repeated measurements of the same fruit from different devices as different “views” to be encoded into the same latent space, and we combine the encoder network with a regression head for prediction. Our approach improves on PLS, with up to 17% lower root mean squared error (RMSE), especially when labelled data is limited. The Barlow Twins loss function also improves calibration transfer results.

• Novel application of Barlow Twins contrastive learning to NIR spectroscopy using repeated measurements from different devices.
• Semi-supervised learning allows unlabelled spectra to assist in training.
• Up to 50% reduction in RMSE in calibration transfer tasks compared to training on the MSE loss only.
• The best performance gains were observed at small labelled training sizes.
• The Barlow Twins loss was not detrimental at large training set sizes.
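The training objective summarised above can be illustrated with a minimal NumPy sketch. It is not the implementation used in this study: the function names `barlow_twins_loss` and `semi_supervised_loss`, the weights `lam` and `alpha`, and their default values are assumptions chosen for illustration. The sketch assumes that `z_a` and `z_b` are encoder outputs for spectra of the same fruit measured on two devices, and that the regression head's MSE is evaluated only on the labelled subset, while the Barlow Twins term uses every pair.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, lam=5e-3, eps=1e-8):
    """Barlow Twins loss for two batches of embeddings of shape (n, d).

    z_a, z_b: embeddings of the two "views" (assumed here to be the same
    fruit measured on two devices). lam is an illustrative weight on the
    redundancy-reduction term, not a value taken from the study.
    """
    n = z_a.shape[0]
    # Standardise every latent dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (d, d).
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    # Invariance term: matching dimensions should correlate perfectly.
    on_diag = np.sum((1.0 - diag) ** 2)
    # Redundancy term: different dimensions should be decorrelated.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)
    return on_diag + lam * off_diag


def semi_supervised_loss(z_a, z_b, y_pred, y_true, labelled, alpha=1.0):
    """MSE on labelled samples plus a weighted Barlow Twins term on all pairs.

    labelled: boolean mask of shape (n,) marking samples with a reference
    value. alpha is a hypothetical weighting between the two terms.
    """
    if labelled.any():
        mse = np.mean((y_pred[labelled] - y_true[labelled]) ** 2)
    else:
        mse = 0.0
    return mse + alpha * barlow_twins_loss(z_a, z_b)
```

In this sketch, unlabelled pairs still contribute through the Barlow Twins term, which is how unlabelled spectra can assist training; in practice both terms would be computed in an automatic differentiation framework so that the encoder and regression head can be trained jointly.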
Wohlers et al. (Sat,) studied this question.