• PatchTST, LSTM, and GRU are compared under identical training and tuning.
• A 3-step framework tests growth-stage vulnerability and stress responses.
• PatchTST achieves best accuracy on 2017–2020 (R²=0.805; RMSE=12.79).
• PatchTST overreacts to extremes and overestimates the 2019 wet-event loss.
• RNNs are more conservative and better match the observed 2019 yield anomaly.

Accurate corn yield prediction is essential for agricultural management and environmental risk assessment. Deep learning approaches, especially recurrent neural networks (RNNs) and Transformer models, have shown strong potential for learning yield responses from multivariate environmental time series. However, head-to-head comparisons under consistent training, tuning, and evaluation protocols, particularly those that explicitly test sensitivity to extreme weather, remain limited. This study systematically evaluates one Transformer-based model (PatchTST) and two RNN-based models (LSTM and GRU) for county-level corn yield prediction across the U.S. Corn Belt. Beyond baseline accuracy, we propose a three-step sensitivity testing framework that (1) screens growth-stage vulnerability, (2) quantifies heat- and drought-related stress–response relationships, and (3) validates perturbation behavior using the documented 2019 extreme wet event. On the test set, PatchTST achieves the strongest overall predictive skill, with the highest R² (0.805) and lowest RMSE (12.79 bu/acre), outperforming LSTM (R²=0.773, RMSE=13.80 bu/acre) and GRU (R²=0.758, RMSE=14.23 bu/acre). Sensitivity results show substantial divergence under climate extremes: PatchTST responds more strongly to severe heat and drought and overestimates losses during the 2019 wet-event reconstruction, whereas LSTM and GRU exhibit more conservative responses and better reproduce the observed 2019 yield anomaly.
Spatial analyses further indicate that PatchTST produces more spatially uniform errors, while RNNs show clearer regional gradients and stronger geographic contrasts. Overall, model architecture affects not only predictive accuracy but also robustness and interpretability under extreme conditions. Transformers are promising for high-accuracy forecasting, while RNNs are preferable for conservative, operational decisions when rare extremes and uncertainty are key concerns.
Xin et al. studied this question.
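The abstract reports accuracy as R² and RMSE and describes sensitivity tests that perturb model inputs. The study's own code is not shown here, so the following is only a minimal sketch of those ideas under stated assumptions: the function names `rmse`, `r_squared`, and `perturbation_response` are hypothetical, and `toy_model` is an invented linear stand-in for a trained predictor, not PatchTST, LSTM, or GRU.

```python
import math
from typing import Callable, List, Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square error, in the units of the yields (e.g. bu/acre)."""
    if len(observed) == 0 or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) == 0 or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    return 1.0 - ss_res / ss_tot


def perturbation_response(
    model: Callable[[List[float]], float],
    features: Sequence[float],
    index: int,
    delta: float,
) -> float:
    """Change in predicted yield when one input feature is shifted by delta."""
    baseline = list(features)
    perturbed = list(features)
    perturbed[index] += delta
    return model(perturbed) - model(baseline)


def toy_model(x: List[float]) -> float:
    """Hypothetical stand-in predictor (assumption, not a model from the study).

    x[0] plays the role of a heat-stress index and x[1] of a moisture index.
    """
    return 180.0 - 2.0 * x[0] + 0.5 * x[1]


if __name__ == "__main__":
    obs = [170.0, 160.0, 185.0]
    pred = [168.0, 163.0, 181.0]
    print("RMSE:", rmse(obs, pred))
    print("R^2:", r_squared(obs, pred))
    # Shift the heat-stress input by +2 units and inspect the yield change.
    print("Response:", perturbation_response(toy_model, [30.0, 100.0], 0, 2.0))
```

Comparing `perturbation_response` across models for the same input shift is one simple way to see whether one model reacts more strongly to an extreme than another, which is the kind of contrast the abstract draws between PatchTST and the RNNs.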