What question did this study set out to answer?

The central aim is to evaluate machine-learning models for one-step-ahead agroclimatic forecasting under degraded sensor-data conditions.

May 21, 2026 · Open Access

Agroclimatic Forecasting Under Degraded Sensor Data: A Robustness Benchmark of Machine-Learning Models

Key Points

The study evaluates machine-learning models for one-step-ahead agroclimatic forecasting under degraded sensor-data conditions.
It uses a real meteorological dataset collected by a field weather station in the Dnipro region of Ukraine.
Twelve regression models were evaluated under five controlled scenarios: baseline, missing values, additive noise, reduced training history, and combined noise–missingness degradation.
Forecasting performance was compared using mean absolute error (MAE) and R².
Ridge Regression achieved the strongest baseline temperature-forecasting performance, with MAE = 0.318 and R² ≈ 0.98 under clean data.
Ridge Regression retained R² > 0.90 when trained on only 50% of the available history.
Under Gaussian noise (σ = 0.05–0.10), Ridge Regression and HistGradientBoosting maintained R² values of 0.95–0.97; under combined degradation (σ = 0.10 and 20% missing data), HistGradientBoosting retained R² > 0.85.
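
As a rough illustration of the evaluation setup named in the key points, the sketch below frames a series as one-step-ahead input/target pairs and scores predictions with MAE and R². The function names, the lag count `n_lags`, the sample temperatures, and the persistence baseline are all illustrative assumptions; the study's actual feature construction and models are not described in this summary.

```python
from typing import List, Sequence, Tuple


def make_one_step_pairs(
    series: Sequence[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Frame a series as (previous n_lags values -> next value) pairs."""
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    X = [list(series[i - n_lags:i]) for i in range(n_lags, len(series))]
    y = [series[i] for i in range(n_lags, len(series))]
    return X, y


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for a constant target")
    return 1.0 - ss_res / ss_tot


# Hypothetical temperature readings, invented for illustration only.
temps = [12.0, 12.5, 13.1, 13.0, 12.4, 11.9, 11.5, 11.8]
X, y = make_one_step_pairs(temps, n_lags=2)

# Persistence baseline (an assumption, not one of the study's models):
# predict that the next value equals the most recent observation.
y_pred = [row[-1] for row in X]

mae = mean_absolute_error(y, y_pred)
r2 = r2_score(y, y_pred)
print(f"pairs={len(X)} MAE={mae:.3f} R2={r2:.3f}")
```

A trivial baseline like this is useful mainly as a reference point: a trained regressor is only worth deploying if it beats persistence on the same pairs.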

Abstract

Reliable short-term agroclimatic forecasting is essential for precision agriculture, irrigation planning, disease-risk assessment, and microclimatic decision support. However, field-deployed sensor systems often operate under degraded data conditions, such as missing measurements, noise, and temporal interruptions, and with limited local computational resources. These constraints make it necessary to evaluate not only forecasting accuracy under clean data, but also model robustness under realistic sensor-data degradation. The objective of this study is to compare machine-learning models for one-step-ahead agroclimatic time-series forecasting under degraded sensor-data conditions. Using a real meteorological dataset collected by a field weather station in the Dnipro region of Ukraine, twelve regression models were evaluated: Ridge Regression, Random Forest, Extra Trees, Gradient Boosting, HistGradientBoosting, Support Vector Regression, Linear SVR, KNN, PLSRegression, ElasticNet, Lasso, and MultiTaskElasticNet. The models were tested under five controlled scenarios: baseline data, missing values, additive noise, reduced training history, and combined noise–missingness degradation. Quantitatively, Ridge Regression achieved the strongest baseline temperature-forecasting performance, with MAE = 0.318 and R² ≈ 0.98 under clean data. It also maintained R² > 0.90 when trained on only 50% of the available history. Under Gaussian noise with σ = 0.05–0.10, Ridge Regression and HistGradientBoosting maintained R² values in the range of 0.95–0.97, whereas under combined degradation with σ = 0.10 and 20% missing data, HistGradientBoosting retained R² > 0.85. These findings indicate that machine-learning models differ substantially in their sensitivity to sensor-data degradation and that robustness-oriented benchmarking is necessary before selecting models for agroclimatic forecasting systems.
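
The five scenarios described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's procedure: the helper names are hypothetical, the series is a synthetic sine wave rather than station data, σ is assumed to apply to an already standardized series, gaps are assumed to be filled by forward fill, the 20% rate in the missing-only scenario is borrowed from the combined scenario, and "reduced training history" is assumed to mean keeping the most recent half.

```python
import math
import random
from typing import List, Optional, Sequence


def add_gaussian_noise(values: Sequence[float], sigma: float,
                       rng: random.Random) -> List[float]:
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def drop_values(values: Sequence[float], missing_fraction: float,
                rng: random.Random) -> List[Optional[float]]:
    """Replace a random fraction of readings with None (missing)."""
    n_missing = round(missing_fraction * len(values))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def forward_fill(values: Sequence[Optional[float]]) -> List[float]:
    """Fill gaps with the last observed value.

    Leading gaps take the first observed value (an assumed convention).
    """
    first = next((v for v in values if v is not None), None)
    if first is None:
        raise ValueError("cannot fill a series with no observations")
    filled, last = [], first
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def keep_recent_history(values: Sequence[float], fraction: float) -> List[float]:
    """Keep only the most recent fraction of the training series."""
    n_keep = max(1, round(fraction * len(values)))
    return list(values[-n_keep:])


rng = random.Random(0)  # fixed seed so the degradation is reproducible
# Synthetic daily cycle sampled hourly for 10 days; NOT the study's data.
clean = [math.sin(2 * math.pi * t / 24) for t in range(240)]

noisy = add_gaussian_noise(clean, 0.10, rng)
scenarios = {
    "baseline": clean,
    "missing": forward_fill(drop_values(clean, 0.20, rng)),
    "noise": noisy,
    "reduced_history": keep_recent_history(clean, 0.50),
    # Combined: sigma = 0.10 noise, then 20% of readings dropped and refilled.
    "combined": forward_fill(drop_values(noisy, 0.20, rng)),
}

for name, series in scenarios.items():
    print(f"{name}: {len(series)} samples")
```

Applying each degradation only to the training inputs while scoring against clean targets is one common design choice for this kind of benchmark; whether the study did so is not stated in this summary.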
