What type of study is this?

This is a cohort study (also classified as a quantitative study).

October 11, 2025 | Open Access

A Comparative Assessment of Regular and Spatial Cross-Validation in Subfield Machine Learning Prediction of Maize Yield from Sentinel-2 Phenology

Key Points

Spatial cross-validation led to more accurate model performance estimates compared to regular cross-validation.
EVI-based models provided more reliable yield predictions than those based on WDRVI in the study.
The analysis utilized 10-fold cross-validation methods and high-resolution data from eastern Croatia.
Ignoring spatial structure can generate misleading conclusions about the accuracy and generalizability of predictive models.
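
The contrast between the two validation schemes in the points above can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the synthetic yield surface, the 40 x 10 sampling grid, the strip-based fold assignment (one simple form of spatial blocking, not necessarily the scheme used in the study), and the 1-nearest-neighbour predictor, which stands in for the study's Random Forest and BGLM models so that the example runs with the standard library only.

```python
import math
import random


def regular_folds(n, k, seed=0):
    """Regular k-fold: assign n samples to k folds at random."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [0] * n
    for rank, i in enumerate(idx):
        folds[i] = rank % k
    return folds


def spatial_folds(coords, k):
    """Spatial k-fold (illustrative): contiguous strips along the x axis,
    so every test sample is held out together with its close neighbours."""
    n = len(coords)
    order = sorted(range(n), key=lambda i: coords[i][0])
    folds = [0] * n
    for rank, i in enumerate(order):
        folds[i] = rank * k // n
    return folds


def cv_rmse(coords, y, folds, k):
    """Cross-validated RMSE of a 1-nearest-neighbour predictor
    (a stand-in model, not one of the models named in the study)."""
    sq_errors = []
    for f in range(k):
        train = [i for i in range(len(y)) if folds[i] != f]
        test = [i for i in range(len(y)) if folds[i] == f]
        for i in test:
            nearest = min(train, key=lambda t: math.dist(coords[i], coords[t]))
            sq_errors.append((y[i] - y[nearest]) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))


# Synthetic, spatially autocorrelated "yield" on a 40 x 10 grid (made-up data).
rng = random.Random(42)
coords = [(float(x), float(r)) for x in range(40) for r in range(10)]
yield_t_ha = [
    10 + 2 * math.sin(x / 5) + 2 * math.cos(r / 7) + rng.gauss(0, 0.1)
    for x, r in coords
]

K = 10  # 10-fold, matching the fold count reported in the study
regular_rmse = cv_rmse(coords, yield_t_ha, regular_folds(len(coords), K), K)
spatial_rmse = cv_rmse(coords, yield_t_ha, spatial_folds(coords, K), K)
print(f"regular 10-fold RMSE: {regular_rmse:.3f}")
print(f"spatial 10-fold RMSE: {spatial_rmse:.3f}")
```

With random folds, almost every test sample has an immediate neighbour in the training set, so the error estimate benefits from spatial dependence. With strip folds the nearest training sample is farther away, which is why the spatial estimate in this toy setting is the more conservative one.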

Abstract

The aim of this study is to determine the reliability of regular and spatial cross-validation methods in predicting subfield-scale maize yields using phenological measures derived from Sentinel-2. Three maize fields in eastern Croatia were monitored during the 2023 growing season, with high-resolution ground truth yield data collected using combine harvester sensors. Sentinel-2 time series were used to compute two vegetation indices, the Enhanced Vegetation Index (EVI) and the Wide Dynamic Range Vegetation Index (WDRVI). These features served as inputs for three machine learning models, including Random Forest (RF) and Bayesian Generalized Linear Model (BGLM), which were trained and evaluated using both regular and spatial 10-fold cross-validation. Results showed that spatial cross-validation produced a more realistic and conservative estimate of model performance, while regular cross-validation systematically overestimated predictive accuracy because of spatial dependence among the samples. EVI-based models were more reliable than WDRVI-based models, generating more accurate phenological fits and yield predictions across parcels. These results emphasize the importance of spatially explicit validation for subfield yield modeling and suggest that overlooking spatial structure can lead to misleading conclusions about model accuracy and generalizability.
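
For readers unfamiliar with the two vegetation indices named in the abstract, the sketch below computes them from surface reflectance using their commonly published formulas. The abstract does not state the band mapping or the WDRVI weighting coefficient, so both are assumptions here: near-infrared, red, and blue are taken to correspond to Sentinel-2 bands B8, B4, and B2, reflectance is assumed to be scaled to the 0 to 1 range, and `alpha=0.2` is only an illustrative default from the range usually quoted for WDRVI (0.1 to 0.2).

```python
def evi(nir, red, blue):
    """Enhanced Vegetation Index with the standard coefficients
    (G = 2.5, C1 = 6, C2 = 7.5, L = 1); inputs are reflectances in 0..1."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def wdrvi(nir, red, alpha=0.2):
    """Wide Dynamic Range Vegetation Index.
    alpha is an assumed weighting coefficient, not a value from the study."""
    return (alpha * nir - red) / (alpha * nir + red)


# Hypothetical reflectances for a dense green canopy pixel (made-up values).
nir, red, blue = 0.40, 0.05, 0.03
print(f"EVI   = {evi(nir, red, blue):.4f}")
print(f"WDRVI = {wdrvi(nir, red):.4f}")
```

Applied to each cloud-free acquisition in a season, functions like these would yield the per-pixel index time series from which phenological measures are then fitted.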
