The identification of biological products, such as vaccines, blood components and recombinant therapeutic proteins, involves diverse challenges. This study introduces a novel method that combines Raman spectroscopy with a transformer model enhanced by helix matrix transformation for the robust identification of 37 distinct biological products. To satisfy the data volume requirements for model training, we generated 133,200 simulated spectra using an innovative approach. Training involved the development of 16,000 models with varying layer configurations, applying either the helix matrix transformation or a reshape operation during each epoch. The optimal model was obtained after 109 epochs of training and was used for subsequent evaluation. In testing, the method was robust against spectral peak drift, random noise and fluorescence interference. The area under the curve (AUC) values for the helix matrix transformation method, the reshape operation, the built-in method of OMNIC v8.3 software, a support vector machine (SVM) model and a long short-term memory (LSTM) model were 1.000, 0.886, 0.698, 0.932 and 0.882, respectively. The corresponding average accuracies (%) were 100.0, 96.35 ± 18.76, 61.08 ± 48.79, 93.92 ± 23.91 and 86.76 ± 33.92, respectively. The model was also successfully applied to the identification of 53 different injectable drugs. These results indicate that helix matrix transformation combined with the transformer model enables the effective identification of biological products based on Raman spectra.

• Novel methodology: Introduction of a new approach that combines Raman spectroscopy with a transformer model enhanced by helix matrix transformation to identify 37 biological products. This method provides a new solution for the detection of analogs based on Raman spectra.
• Data generation: Creation of 133,200 simulated spectra using innovative techniques to meet the data volume requirements of model training. This provides a method for constructing virtual Raman spectra that enhances data diversity.
• Robust performance: The method exhibited strong resilience against spectral peak drift, random noise and fluorescence interference during testing.
• New data processing method: Results suggest that combining helix matrix transformation with transformer models improves the identification of biological products from Raman spectra.
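The three interference types named above (spectral peak drift, random noise and fluorescence) can be illustrated with a minimal Python sketch. It is not the simulation procedure used in this study, which the abstract does not specify. The function name `perturb_spectrum`, its parameters, and the choice of a broad Gaussian to stand in for a fluorescence background are all assumptions made for illustration.

```python
import numpy as np


def perturb_spectrum(spectrum, axis=None, shift_cm=0.0, noise_sd=0.0,
                     fluor_amp=0.0, rng=None):
    """Return a perturbed copy of a 1-D spectrum (hypothetical helper).

    shift_cm  : peak drift, in the units of `axis` (positive moves peaks up).
    noise_sd  : standard deviation of additive Gaussian noise.
    fluor_amp : height of a broad, smooth baseline (assumed fluorescence model).
    """
    spectrum = np.asarray(spectrum, dtype=float)
    if axis is None:
        axis = np.arange(spectrum.size, dtype=float)
    axis = np.asarray(axis, dtype=float)  # must be increasing for np.interp
    rng = np.random.default_rng() if rng is None else rng

    # Peak drift: resample so a feature at position p appears at p + shift_cm.
    drifted = np.interp(axis - shift_cm, axis, spectrum)

    # Random noise: independent Gaussian noise on every channel.
    noise = rng.normal(0.0, noise_sd, size=spectrum.size)

    # Fluorescence: a wide Gaussian hump across the normalized axis (assumption).
    t = (axis - axis[0]) / (axis[-1] - axis[0])
    baseline = fluor_amp * np.exp(-((t - 0.3) ** 2) / (2 * 0.4 ** 2))

    return drifted + noise + baseline


# Usage: one synthetic band at 1000 cm^-1, perturbed in all three ways.
wavenumbers = np.linspace(400.0, 1800.0, 1401)
clean = np.exp(-((wavenumbers - 1000.0) ** 2) / (2 * 5.0 ** 2))
noisy = perturb_spectrum(clean, axis=wavenumbers, shift_cm=3.0,
                         noise_sd=0.02, fluor_amp=0.5,
                         rng=np.random.default_rng(0))
```

Applying such perturbations to held-out spectra is one way to probe how a trained classifier responds to each interference type separately, since each parameter can be varied while the others are held at zero.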
Ling et al. (Sun,) studied this question.