What does this research mean for the field?

While Transformer models like PatchTST achieve higher overall accuracy for corn yield prediction, RNN models like LSTM and GRU are more robust and conservative when predicting yield anomalies during extreme weather events. Novelty: incremental. Consensus alignment: neutral.

What question did this study set out to answer?

This research aims to evaluate the predictive performance and sensitivity of Transformer and RNN models for corn yield forecasting.

May 17, 2026 | Open Access

A Comprehensive Comparison of Transformer and RNN Models for Corn Yield Prediction: A Case Study of the U.S. Corn Belt

Key Points

This research aims to evaluate the predictive performance and sensitivity of Transformer and RNN models for corn yield forecasting.
Compared PatchTST, LSTM, and GRU models under identical training and tuning conditions.
Implemented a three-step sensitivity testing framework to assess growth-stage vulnerability and stress responses based on environmental conditions.
Analyzed predictive accuracy on county-level corn yield data from the U.S. Corn Belt over the years 2017–2020.
PatchTST achieved the highest accuracy on the test set, with R²=0.805 and RMSE=12.79 bu/acre, outperforming LSTM (R²=0.773, RMSE=13.80 bu/acre) and GRU (R²=0.758, RMSE=14.23 bu/acre).
Sensitivity analysis revealed that PatchTST overreacts to climate extremes, while LSTM and GRU provided more conservative estimates during the 2019 yield anomaly.
Spatial analyses showed PatchTST generates uniform errors across regions, while RNNs exhibit distinct geographic variations.
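
For readers less familiar with the two accuracy metrics quoted above, the following is a minimal sketch of how R² and RMSE are conventionally computed from observed and predicted yields. The function names and the sample values are illustrative assumptions and are not taken from the study; the authors' actual evaluation code is not shown in the source.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (e.g. bu/acre)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical county yields (bu/acre), for illustration only; not data from the study.
observed = [180.0, 165.0, 192.0, 150.0]
predicted = [175.0, 170.0, 188.0, 158.0]

print(f"RMSE = {rmse(observed, predicted):.2f} bu/acre")
print(f"R^2  = {r_squared(observed, predicted):.3f}")
```

A lower RMSE means smaller typical errors in bu/acre, while an R² closer to 1 means the model explains more of the variance in observed yields, which is why the two metrics are reported together.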

Abstract

• PatchTST, LSTM, and GRU are compared under identical training and tuning.
• A 3-step framework tests growth-stage vulnerability and stress responses.
• PatchTST achieves best accuracy on 2017–2020 (R²=0.805; RMSE=12.79).
• PatchTST overreacts to extremes and overestimates the 2019 wet-event loss.
• RNNs are more conservative and better match the observed 2019 yield anomaly.

Accurate corn yield prediction is essential for agricultural management and environmental risk assessment. Deep learning approaches, especially recurrent neural networks (RNNs) and Transformer models, have shown strong potential for learning yield responses from multivariate environmental time series. However, head-to-head comparisons under consistent training, tuning, and evaluation protocols, particularly those that explicitly test sensitivity to extreme weather, remain limited. This study systematically evaluates one Transformer-based model (PatchTST) and two RNN-based models (LSTM and GRU) for county-level corn yield prediction across the U.S. Corn Belt. Beyond baseline accuracy, we propose a three-step sensitivity testing framework that (1) screens growth-stage vulnerability, (2) quantifies heat- and drought-related stress–response relationships, and (3) validates perturbation behavior using the documented 2019 extreme wet event. On the test set, PatchTST achieves the strongest overall predictive skill, with the highest R² (0.805) and lowest RMSE (12.79 bu/acre), outperforming LSTM (R²=0.773, RMSE=13.80 bu/acre) and GRU (R²=0.758, RMSE=14.23 bu/acre). Sensitivity results show substantial divergence under climate extremes: PatchTST responds more strongly to severe heat and drought and overestimates losses during the 2019 wet-event reconstruction, whereas LSTM and GRU exhibit more conservative responses and better reproduce the observed 2019 yield anomaly. Spatial analyses further indicate that PatchTST produces more spatially uniform errors, while RNNs show clearer regional gradients and stronger geographic contrasts. Overall, model architecture affects not only predictive accuracy but also robustness and interpretability under extreme conditions. Transformers are promising for high-accuracy forecasting, while RNNs are preferable for conservative, operational decisions when rare extremes and uncertainty are key concerns.
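
The first step of the sensitivity framework described in the abstract (screening growth-stage vulnerability) can be pictured as a perturbation test: shift an input variable within one growth-stage window at a time and record how the predicted yield changes. The sketch below is only a rough illustration of that idea under stated assumptions. `toy_yield_model`, the stage windows, the 30 °C threshold, and the perturbation size are all hypothetical stand-ins, not the study's models, data, or protocol.

```python
def toy_yield_model(weekly_temp_c):
    """Hypothetical stand-in for a trained model: penalizes heat above 30 C,
    with a heavier penalty in weeks 8-11 (an assumed sensitive window)."""
    base = 180.0
    penalty = 0.0
    for week, temp in enumerate(weekly_temp_c):
        weight = 3.0 if 8 <= week <= 11 else 1.0
        penalty += weight * max(0.0, temp - 30.0)
    return base - penalty

def stage_sensitivity(model, series, stages, delta):
    """Add `delta` to one growth-stage window at a time and return the
    change in predicted yield relative to the unperturbed baseline."""
    baseline = model(series)
    changes = {}
    for name, (start, end) in stages.items():
        perturbed = list(series)          # copy so each test starts from the baseline
        for week in range(start, end):    # `end` is exclusive
            perturbed[week] += delta
        changes[name] = model(perturbed) - baseline
    return changes

# Hypothetical 16-week season at a constant 29 C, with illustrative stage windows.
season = [29.0] * 16
stages = {"vegetative": (0, 8), "silking": (8, 12), "grain_fill": (12, 16)}

changes = stage_sensitivity(toy_yield_model, season, stages, delta=3.0)
for name, change in changes.items():
    print(f"{name}: {change:+.1f} bu/acre")
```

In a comparison like the one in this study, the same perturbation would be passed to each trained model (PatchTST, LSTM, GRU) in place of the toy function, so that differences in the resulting yield changes reflect differences in how each architecture responds to stress.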
