This article presents a deep learning framework that links the spectra of microalgae extracts to the quantification of individual pigments. The framework relies on Convolutional Neural Network (CNN) architectures. First, architectures from the literature were implemented and challenged. While they gave good results for most pigments, they failed to adequately predict the concentration of zeaxanthin, which is present in very low amounts (mean absolute percentage error, MAPE, above 20%). Consequently, a dedicated network was designed. After fine-tuning, it reached 14.6% MAPE on the validation set. In addition to network architecture, data augmentation and preprocessing were explored. The results show that data augmentation by derivation alone (without extra preprocessing) yields the best performance. Finally, the relationship between training dataset size and performance was investigated. Using the newly introduced learning curve tool, it was possible to estimate the best achievable performance (3.10 to 8.57% MAPE) and the convergence rate (approximately square root to quadratic, depending on the pigment) for the major pigments. The code and the database are available on GitHub, and the trained CNNs are available on Hugging Face. (to be adjusted upon article acceptance)
Bayomie et al. (Sun,) studied this question.
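To make the two quantities reported above concrete, the following is a minimal sketch of how MAPE can be computed and how a plateau and a convergence rate might be extracted from a learning curve. It is not the article's learning curve tool: the power-law form `error ≈ e_inf + a * n**(-b)`, the function names `mape` and `fit_learning_curve`, and the grid-search fitting procedure are all assumptions made for illustration. Under that assumed form, `e_inf` plays the role of the best achievable performance and `b` the convergence rate (`b = 0.5` would correspond to square-root convergence, `b = 2` to quadratic).

```python
import math


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (targets must be non-zero)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_learning_curve(sizes, errors, n_grid=200):
    """Fit errors ~ e_inf + a * n**(-b), an ASSUMED power-law form.

    Grid search over the plateau e_inf; for each candidate, a and b come
    from an ordinary least-squares line in log-log space. The candidate
    with the smallest squared error in the original space is kept.
    Returns (e_inf, a, b).
    """
    if len(sizes) != len(errors) or len(sizes) < 3:
        raise ValueError("need at least three (size, error) pairs")
    lowest = min(errors)
    if lowest <= 0:
        raise ValueError("errors must be strictly positive")
    xs = [math.log(n) for n in sizes]
    x_mean = sum(xs) / len(xs)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("training set sizes must not all be equal")
    best = None
    for k in range(n_grid):
        e_inf = lowest * k / n_grid  # stays strictly below the lowest error
        ys = [math.log(e - e_inf) for e in errors]
        y_mean = sum(ys) / len(ys)
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
        a, b = math.exp(y_mean - slope * x_mean), -slope
        sse = sum((e - (e_inf + a * n ** (-b))) ** 2 for n, e in zip(sizes, errors))
        if best is None or sse < best[0]:
            best = (sse, e_inf, a, b)
    return best[1:]


# Synthetic, noise-free demo (NOT data from the article).
sizes = [10, 20, 50, 100, 200, 500, 1000]
errors = [5.0 + 100.0 * n ** -0.5 for n in sizes]
e_inf, a, b = fit_learning_curve(sizes, errors)
```

On real validation errors the fit would be noisy, so a nonlinear least-squares routine with uncertainty estimates would likely be preferable to this grid search; the sketch only shows how a plateau and a rate can be read off a curve of error against training set size.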