What question did this study set out to answer?

The research aims to improve the grading of flue-cured tobacco through machine learning and hyperspectral imaging.

February 27, 2026 | Open Access

Enabling rapid and accurate grade discrimination of flue-cured tobacco: a near-infrared hyperspectral and machine learning approach

Key Points

Developed a machine learning model using near-infrared hyperspectral data.
Used multivariate statistical analysis to examine correlations among grade, spectral data, and chemical composition.
Employed three preprocessing methods and four classification models to enhance grading accuracy.
Investigated the impact of characteristic bands on classification performance.
Achieved 98.5% classification accuracy using full-band data with MSC preprocessing and the PLS-DA model.
Obtained 94.0% accuracy using 70% of the full bands, selected via the successive projections algorithm (SPA).
Found highly significant correlations between tobacco grade and nicotine, reducing sugars, total sugars, and the sugar-nicotine ratio.
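
The band-selection idea in the key points above can be illustrated with a minimal sketch of the successive projections algorithm (SPA), written with the Python standard library only. This is not the authors' code: the function name `spa_select`, the `start` parameter, and the list-of-rows data layout are assumptions made for illustration. The sketch covers only the projection step that picks weakly collinear bands; a full workflow would normally also evaluate candidate subsets against a calibration model.

```python
import math


def spa_select(X, n_select, start=0):
    """Return n_select band indices chosen by successive projections.

    X is a list of rows (samples); each row is a list of band values.
    Assumes 1 <= n_select <= number of bands (hypothetical helper).
    """
    n_bands = len(X[0])
    # Work on column vectors so each band can be projected independently.
    cols = [[row[j] for row in X] for j in range(n_bands)]
    selected = [start]
    for _ in range(n_select - 1):
        ref = cols[selected[-1]]
        ref_norm_sq = sum(v * v for v in ref)
        best, best_norm = None, -1.0
        for j in range(n_bands):
            if j in selected:
                continue
            # Remove the part of band j that lies along the last selected band.
            scale = sum(a * b for a, b in zip(cols[j], ref)) / ref_norm_sq
            cols[j] = [a - scale * b for a, b in zip(cols[j], ref)]
            norm = math.sqrt(sum(v * v for v in cols[j]))
            # Keep the band with the largest remaining (non-redundant) signal.
            if norm > best_norm:
                best, best_norm = j, norm
        selected.append(best)
    return selected
```

Under these assumptions, retaining 70% of the bands would correspond to calling `spa_select(X, n_select=round(0.7 * n_bands))`, where `n_bands` is the number of columns in `X`.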

Abstract

To address the inefficiency and subjectivity of manual grading, this study established a machine learning model based on near-infrared hyperspectral data (950–1650 nm) for the accurate classification of first-roasted tobacco grades. Multivariate statistical analysis uncovered the intrinsic correlations among grade, spectral data, and chemical composition, thereby laying a theoretical foundation for hyperspectral-based grading technology. Three preprocessing methods (multiplicative scatter correction (MSC), standard normal variate (SNV) transformation, and Savitzky–Golay convolution smoothing) and four classification models (random forest, backpropagation neural network, extreme learning machine, and partial least squares–discriminant analysis (PLS-DA)) were employed. Moreover, characteristic bands were selected through the successive projections algorithm (SPA) and competitive adaptive reweighted sampling (CARS) to investigate how the number of characteristic bands affects grade classification accuracy. The results showed that grade exhibited highly significant correlations with nicotine, reducing sugars, total sugars, and the sugar-nicotine ratio, and that the spectra exhibited highly significant correlations with nicotine. Full-band data with MSC preprocessing combined with the PLS-DA model reached a classification accuracy of 98.5%, while accuracy reached 94.0% when using 70% of the full bands, selected by the SPA. In conclusion, near-infrared hyperspectral imaging combined with machine learning not only offers high efficiency, accuracy, and non-destructiveness in the grading of first-roasted tobacco leaves but also provides a theoretical basis for industrial hyperspectral grading by elucidating the correlations among spectrum, chemical composition, and grade. This method avoids the subjectivity of manual grading and offers key technical support for advancing the intelligence and automation of first-roasted tobacco leaf grading in the tobacco industry.
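
Two of the preprocessing methods named in the abstract, SNV and MSC, can be sketched in a few lines. The following is a minimal illustration using only the Python standard library, not the authors' implementation: the function names `snv` and `msc`, the list-based data layout, and the choice of the mean spectrum as the default MSC reference are assumptions. A production pipeline would more likely use array libraries.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean
    and scale it by its own sample standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)  # n - 1 denominator
    return [(v - m) / s for v in spectrum]


def msc(spectra, reference=None):
    """Multiplicative scatter correction.

    Each spectrum is regressed on a reference spectrum, and the fitted
    offset and slope are then removed. If no reference is given, the
    mean spectrum of the input set is used (an assumed default).
    """
    n_bands = len(spectra[0])
    if reference is None:
        reference = [mean(row[j] for row in spectra) for j in range(n_bands)]
    ref_mean = mean(reference)
    ref_var = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for row in spectra:
        row_mean = mean(row)
        # Least-squares fit: row ~ intercept + slope * reference
        slope = sum((r - ref_mean) * (v - row_mean)
                    for r, v in zip(reference, row)) / ref_var
        intercept = row_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in row])
    return corrected
```

Both functions target scatter-related offset and scaling differences between samples. SNV normalizes each spectrum independently, whereas MSC aligns every spectrum to a shared reference, so the same reference would need to be reused when correcting new samples.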
