Spring frosts pose a major threat to tea production, causing severe damage to tender spring buds and substantial economic losses. To support timely frost protection, this study develops an interpretable machine learning framework for next-day frost forecasting at a tea plantation in Danyang, eastern China. Using nine years (2008–2016) of multi-source data, comprising high-resolution on-site meteorological observations and daily records from surrounding regional stations, we engineered a set of predictive features capturing local microclimatic conditions, regional synoptic conditions, and short-term temporal dynamics. A two-stage feature selection procedure, combining Spearman correlation screening with SHAP-based importance ranking, identified a subset of 14 robust predictors. Among eight benchmarked models, XGBoost performed best on a chronologically held-out test set, with a critical success index (CSI) of 0.736, accuracy of 91.0%, F1-score of 0.848, and AUC-ROC of 0.968. Ablation experiments demonstrated the added value of data integration: CSI improved from 0.617 with local data only to 0.736 with the full multi-source inputs. SHAP analysis further showed that the model's predictions are consistent with established frost formation physics, with nocturnal cooling rate and regional humidity among the key drivers. These results indicate that integrating multi-scale meteorological data with interpretable machine learning offers a reliable, transparent, and operationally viable tool for frost risk management, and can support precision horticulture for perennial crops such as tea.
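For readers less familiar with the verification scores quoted above, the following minimal sketch shows how CSI, accuracy, and F1-score follow from a binary frost / no-frost contingency table. The function names and the counts are hypothetical and are not taken from the study. Because CSI and F1 depend on the same three counts (hits, false alarms, misses), they are linked by F1 = 2·CSI / (1 + CSI), which gives a quick consistency check on the reported pair 0.736 and 0.848.

```python
def frost_scores(tp, fp, fn, tn):
    """Return (CSI, accuracy, F1) for a binary frost / no-frost forecast.

    tp: frost forecast and observed (hits)
    fp: frost forecast but not observed (false alarms)
    fn: frost observed but not forecast (misses)
    tn: no frost forecast and none observed (correct negatives)
    """
    total = tp + fp + fn + tn
    csi = tp / (tp + fp + fn)          # critical success index ignores correct negatives
    accuracy = (tp + tn) / total
    f1 = 2 * tp / (2 * tp + fp + fn)   # harmonic mean of precision and recall
    return csi, accuracy, f1


def f1_from_csi(csi):
    """F1 implied by a given CSI, since both use only tp, fp and fn."""
    return 2 * csi / (1 + csi)


# Hypothetical counts for illustration only; NOT the study's confusion matrix.
csi, acc, f1 = frost_scores(tp=40, fp=8, fn=7, tn=145)
print(f"CSI={csi:.3f}  accuracy={acc:.3f}  F1={f1:.3f}")

# Consistency check on the reported scores: CSI 0.736 implies F1 of about 0.848.
print(round(f1_from_csi(0.736), 3))
```

CSI is the stricter of the two scores for a rare event such as frost, because correct negatives do not enter it, so a model cannot score well simply by predicting "no frost" on most nights.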
Zhang et al. (Sun,) studied this question.