ABSTRACT To address fraud in the rice market and the limitations of traditional traceability methods, this work proposes a novel framework that integrates gas chromatography–mass spectrometry‐based metabolomics with machine learning and heuristic feature extraction. A total of 190 japonica rice samples from six origins in Heilongjiang Province were analyzed, yielding 46 metabolite features. After Mahalanobis distance‐based quality control to remove outliers, 120 valid samples were retained and visualized via uniform manifold approximation and projection. Six representative machine learning algorithms were systematically evaluated, and a genetic algorithm and simulated annealing were employed for feature optimization. The random forest algorithm combined with the genetic algorithm achieved the highest performance (validation accuracy = 99.5%, AUC = 0.998, F‐measure = 0.995), outperforming existing spectral and isotope‐based methods. Twenty‐eight key metabolites were identified, each closely linked to origin‐specific environmental factors. Statistical tests confirmed significant performance differences between algorithms. This work provides a robust, interpretable, and cost‐effective solution for rice origin traceability, with implications for food safety supervision and the authentication of high‐quality agricultural products.
Yu et al. studied this question.
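The Mahalanobis distance quality-control step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which cutoff rule the authors used, so the quantile-based threshold below is an assumption, as are the function name `mahalanobis_filter`, the parameter `keep_quantile`, and the synthetic data standing in for the 190 × 46 metabolite matrix.

```python
import numpy as np

def mahalanobis_filter(X, keep_quantile=0.95):
    """Flag samples by squared Mahalanobis distance from the data centroid.

    Returns (mask, d2): mask is True for rows whose squared distance is at or
    below the chosen quantile of all distances. The quantile cutoff is an
    illustrative assumption, not the rule used in the paper.
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # The pseudo-inverse tolerates collinear metabolite features.
    cov_inv = np.linalg.pinv(cov)
    diff = X - mu
    # Row-wise quadratic form: d2[i] = diff[i] @ cov_inv @ diff[i]
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    cutoff = np.quantile(d2, keep_quantile)
    return d2 <= cutoff, d2

# Synthetic stand-in for a (samples x metabolite features) matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(190, 46))
X[:2] += 10.0  # two planted outliers, shifted in every feature

mask, d2 = mahalanobis_filter(X, keep_quantile=0.95)
X_clean = X[mask]
print(f"kept {X_clean.shape[0]} of {X.shape[0]} samples")
```

A fixed quantile always discards the same fraction of samples regardless of how many true outliers exist. A cutoff taken from a chi-square distribution with as many degrees of freedom as features is a common alternative when the features are roughly Gaussian.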