What does this research mean for the field?

Imputation techniques, particularly LSTM and time-based interpolation, significantly improve the accuracy of predicting air and surface temperatures in Kuwait using multi-dimensional high-frequency climate data. Novelty: incremental. Consensus alignment: neutral.

What question did this study set out to answer?

The aim is to assess how different imputation techniques improve the accuracy of predictions in high-frequency climate data.

February 28, 2026 | Open Access

Imputation of Multi-Dimensional High-Frequency Climate Data to Predict Air and Surface Temperatures in Kuwait

Key points

Evaluated three traditional imputation methods: mean, k-nearest neighbor, and time-based interpolation.
Introduced a new approach using random forest, LSTM, and Transformer regression methods.
Implemented a leave-one-year-out cross-validation strategy.
All imputation methods showed improved performance compared to no imputation.
LSTM and time-based interpolation were identified as the most effective combination.
Imputation based on previous years' data did not perform well.
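
To make the difference between two of the traditional methods concrete, here is a minimal sketch using pandas. It is not the authors' code: the series name `air_temp`, the timestamps, and the readings are hypothetical values chosen for illustration, and k-nearest neighbor imputation is left out to keep the example short.

```python
import numpy as np
import pandas as pd

# Hypothetical, irregularly spaced readings with two gaps (illustrative values only).
index = pd.to_datetime([
    "2021-07-01 00:00", "2021-07-01 00:10", "2021-07-01 00:20",
    "2021-07-01 00:50", "2021-07-01 01:00",
])
air_temp = pd.Series([30.0, np.nan, 32.0, np.nan, 36.0], index=index, name="air_temp")

# Mean imputation: every gap receives the mean of the observed readings.
mean_imputed = air_temp.fillna(air_temp.mean())

# Time-based interpolation: each gap is filled linearly between its neighbors,
# weighted by the elapsed time rather than by row position.
time_imputed = air_temp.interpolate(method="time")

print(pd.DataFrame({"mean": mean_imputed, "time": time_imputed}))
```

Mean imputation assigns the same value to both gaps, while time-based interpolation follows the local trend and accounts for the uneven spacing between the 00:20 and 01:00 readings.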

Abstract

Missing values may arise in climate data collection due to sensor malfunction, transmission errors, device calibration, and operational issues. The problem is more severe for multi-dimensional, high-frequency climate data sets, where some or all climate readings may be missing at multiple timestamps. Missing data in high-frequency climate modeling can lead to inaccurate prediction models, which in turn affect overall assessments, planning, and climate-related measures and policy. In this paper, we evaluate the performance of three imputation techniques based on the mean, k-nearest neighbor, and time-based interpolation, together with a new temporal cross-year climate imputation approach, using random forest, long short-term memory (LSTM), and contextual embedding-based Transformer regression methods. We report our findings on four years of multi-output, high-frequency, multi-dimensional climate data collected in Kuwait. Using a leave-one-year-out cross-validation approach, our results show that all imputation methods perform better than no imputation, with LSTM and time-based interpolation emerging as the best combination. Imputing climate data based on previous years’ timestamps did not yield good results, highlighting the variability of climate data across years.
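
The leave-one-year-out strategy mentioned in the abstract can be sketched as follows. This is an illustrative assumption about how such folds might be built, not the paper's implementation: the function name `leave_one_year_out` and the calendar years used are hypothetical, since the summary states only that four years of data were collected.

```python
from datetime import datetime


def leave_one_year_out(timestamps):
    """Yield (held_out_year, train_idx, test_idx), holding out one calendar year per fold."""
    years = sorted({ts.year for ts in timestamps})
    for held_out in years:
        test_idx = [i for i, ts in enumerate(timestamps) if ts.year == held_out]
        train_idx = [i for i, ts in enumerate(timestamps) if ts.year != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical timestamps spanning four years (two readings per year for brevity).
timestamps = [
    datetime(year, 7, 1, hour)
    for year in (2019, 2020, 2021, 2022)
    for hour in (0, 12)
]

folds = list(leave_one_year_out(timestamps))
for held_out, train_idx, test_idx in folds:
    print(held_out, "train rows:", len(train_idx), "test rows:", len(test_idx))
```

Each fold trains on three years and evaluates on the remaining one, so every year serves as the test set exactly once and no readings from the held-out year leak into training.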

Cite This Study

Khan et al. studied this question.

synapsesocial.com/papers/69a287690a974eb0d3c03117 https://doi.org/10.3390/info17030221
