Classifying polymers from spectroscopic data is challenging because of spectral overlap and matrix interferences, so spectral quality must be enhanced to obtain precise analytical results. This study presents a data augmentation framework for microplastic identification that integrates raw, mixture, and hybrid Fourier transform infrared (FTIR) spectral datasets. This approach increases dataset diversity and provides a basis for improving model generalization across the spectral variations found in environmental samples. Classification and regression models were compared in combination with six preprocessing techniques (polynomial fitting, linear correction, standard normal variate, Savitzky–Golay filtering, Gaussian smoothing, and moving average filtering) to improve the accuracy of microplastic classification for eight commonly used plastics: polyvinyl chloride, high-density polyethylene, low-density polyethylene, polypropylene, polystyrene, polyethylene terephthalate, polyethylene, and polyamide. The logistic regression model achieved near-optimal accuracy, precision, recall, and F1 score, and generalized well under challenging conditions, with an accuracy of 96.14% on the hybrid FTIR dataset. Validation on an external real-world environmental spectral dataset yielded an accuracy of 93.51%, further confirming the model’s ability to generalize and capture meaningful spectral variability. Explainable artificial intelligence techniques, including Shapley additive explanations (SHAP) and local interpretable model-agnostic explanations (LIME), were used to identify the spectral regions that contribute most to classification and provided chemically meaningful interpretations of model decisions. MPsSpecXAI leverages data fusion to automate and improve microplastic classification in environmental samples.
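To make the augmentation and preprocessing steps concrete, the following is a minimal sketch of two of the six named preprocessing techniques (standard normal variate and moving average filtering) applied to a synthetic two-polymer mixture spectrum. The abstract does not describe how mixture spectra are constructed, so the linear blend used here is an assumption, and the function names (`snv`, `moving_average`, `mix_spectra`, `gaussian_band`), the window size, and the synthetic bands are all hypothetical illustrations rather than the authors' implementation.

```python
import math
import statistics


def snv(spectrum):
    """Standard normal variate: center a spectrum and scale it to unit standard deviation."""
    mean = statistics.fmean(spectrum)
    std = statistics.pstdev(spectrum)
    if std == 0:
        raise ValueError("SNV is undefined for a flat spectrum")
    return [(x - mean) / std for x in spectrum]


def moving_average(spectrum, window=3):
    """Centered moving average; the window shrinks at the edges instead of padding."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(spectrum)
    return [
        statistics.fmean(spectrum[max(0, i - half):min(n, i + half + 1)])
        for i in range(n)
    ]


def mix_spectra(spectrum_a, spectrum_b, fraction_a):
    """Assumed augmentation rule: a linear blend of two spectra on the same axis."""
    if len(spectrum_a) != len(spectrum_b):
        raise ValueError("spectra must share the same wavenumber axis")
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie in [0, 1]")
    return [
        fraction_a * a + (1.0 - fraction_a) * b
        for a, b in zip(spectrum_a, spectrum_b)
    ]


def gaussian_band(n_points, center, width):
    """Synthetic absorption band used only to stand in for real FTIR data."""
    return [math.exp(-0.5 * ((i - center) / width) ** 2) for i in range(n_points)]


# Two synthetic single-band "polymers" on a shared 100-point axis.
polymer_a = gaussian_band(100, center=30, width=4)
polymer_b = gaussian_band(100, center=70, width=6)

# Blend, smooth, then normalize, in that order.
mixture = mix_spectra(polymer_a, polymer_b, fraction_a=0.5)
processed = snv(moving_average(mixture, window=5))
```

The processed spectra from a step like this would then be passed to a classifier such as logistic regression; that stage is omitted here because the abstract gives no details of the model configuration.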
Sukkuea et al. studied this question.