What question did this study set out to answer?

The aim is to establish a benchmarking framework to analyze various deep learning models for long-term geospatial predictions.

June 14, 2026 | Open Access

Simulation of long-term spatio-temporal environmental dynamics using a unified benchmark of neighbor augmenting, LSTM and graph attention models

Key Points

The aim is to establish a benchmarking framework to analyze various deep learning models for long-term geospatial predictions.
Utilized a unified preprocessing and evaluation pipeline on annual satellite data from 2000 to 2023, reserving 2024 for testing.
Compared Long Short-Term Memory models with auxiliary variables against hybrid Graph Attention Network–LSTM and fully attention-based models.
Evaluated model performance using global measures like R², RMSE, MAE, MAPE, and correlation.
LSTM–CA with auxiliary inputs achieved the highest performance (R² ≈ 0.95), demonstrating stability and effectiveness.
The GAT–Temporal Attention model with an MLP block performed second best, while removing the MLP led to unstable results.
Vegetation and water-related indices showed higher predictability, emphasizing the importance of auxiliary data over model complexity.
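
The global measures listed above can be illustrated with a small sketch. It is not the authors' evaluation code: the function name `global_metrics`, the flattening of all pixels into one list, and the `eps` guard in the MAPE denominator are assumptions made for illustration.

```python
import math


def global_metrics(y_true, y_pred, eps=1e-8):
    """Compute global pixel-wise metrics over flattened reference/predicted values.

    y_true, y_pred: equal-length sequences of floats (all pixels flattened).
    eps: hypothetical guard against division by zero in MAPE.
    Note: R^2 and correlation are undefined when y_true (or y_pred) is constant.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n

    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE in percent; unstable when reference values are close to zero
        "mape": 100.0 * sum(abs(e) / max(abs(t), eps)
                            for e, t in zip(errors, y_true)) / n,
        "corr": cov / math.sqrt(ss_tot * ss_pred),        # Pearson correlation
    }


if __name__ == "__main__":
    reference = [0.10, 0.40, 0.55, 0.80]
    predicted = [0.12, 0.38, 0.60, 0.75]
    print(global_metrics(reference, predicted))
```

A constant offset between prediction and reference leaves the correlation at 1 while lowering R², which is one reason to report both measures side by side.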

Abstract

Rapid environmental change has increased the need for reliable long-term geospatial prediction. However, accurately modeling spatio-temporal geospatial dynamics remains challenging because of nonlinearities, complex spatial dependencies, and external driving factors. In this paper, a comprehensive benchmarking framework is proposed for comparing neighborhood-based, graph-based, and attention-based spatio-temporal deep learning models under the same preprocessing, training, and testing procedure. Long Short-Term Memory (LSTM) models with and without auxiliary variables are compared with hybrid Graph Attention Network–LSTM (GAT–LSTM) models and fully attention-based GAT–Temporal Attention models, with and without a feed-forward (MLP/FFN) block. All models are trained using a unified preprocessing and evaluation pipeline on annual satellite data in Network Common Data Form (NetCDF) from 2000 to 2023, with 2024 reserved as a fully unseen test dataset. Global pixel-wise measures such as R², RMSE, MAE, MAPE, and correlation are used to evaluate model performance based on the agreement between predicted and reference vectors. Findings indicate that the LSTM–CA with auxiliary inputs (3 × 3 neighborhood) achieves the best and most stable performance (R² ≈ 0.95), highlighting the importance of the integrated Cellular Automata (CA) structure and auxiliary driving factors. The GAT–Temporal Attention model with an MLP block ranks second, while removing the MLP or using hybrid LSTM–GAT configurations leads to unstable or degraded performance. Index-wise analysis shows that vegetation and water-related indices are more predictable. The results indicate that strong temporal modeling combined with auxiliary information is more important than the complexity of spatial attention.
The main novelty of this paper is that it does not introduce a new neural network model; instead, it proposes a comparative engineering experiment to assess the conditions under which neighborhood-based temporal models can outperform graph-attention models in long-range geospatial prediction.
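
The 3 × 3 neighborhood augmentation and the temporal hold-out mentioned in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's pipeline: the names `neighborhood_3x3` and `build_pixel_samples`, the edge-clamping rule for border pixels, and the way sequences are paired with a target year are all hypothetical. Only the year ranges come from the abstract.

```python
TRAIN_YEARS = list(range(2000, 2024))  # 2000..2023 inclusive, per the abstract
TEST_YEAR = 2024                       # reserved as unseen test data


def neighborhood_3x3(grid, row, col):
    """Return the 9 values of the 3 x 3 window centred on (row, col).

    Border pixels reuse the nearest valid cell (edge clamping). This padding
    rule is an assumption; the abstract does not specify one.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    window = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r = min(max(row + dr, 0), n_rows - 1)
            c = min(max(col + dc, 0), n_cols - 1)
            window.append(grid[r][c])
    return window


def build_pixel_samples(grids_by_year, input_years, target_year):
    """Build one (sequence, target) pair per pixel.

    grids_by_year: dict mapping year -> 2D list of index values (same shape).
    sequence: one 9-value neighborhood vector per input year, in order.
    target:   the pixel's own value in target_year.
    """
    first = grids_by_year[input_years[0]]
    samples = []
    for row in range(len(first)):
        for col in range(len(first[0])):
            sequence = [neighborhood_3x3(grids_by_year[y], row, col)
                        for y in input_years]
            samples.append((sequence, grids_by_year[target_year][row][col]))
    return samples


if __name__ == "__main__":
    # Tiny synthetic 2 x 2 grids standing in for annual satellite index maps.
    toy = {2000: [[1, 2], [3, 4]], 2001: [[2, 3], [4, 5]], 2002: [[3, 4], [5, 6]]}
    for seq, target in build_pixel_samples(toy, [2000, 2001], 2002):
        print(len(seq), len(seq[0]), target)
```

Each per-pixel sequence has shape (time steps, 9), which is the kind of input a recurrent model such as an LSTM would consume. Auxiliary variables could be appended to each 9-value vector in the same way.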

Cite This Study

Karimadini et al. (Fri,) studied this question.

synapsesocial.com/papers/6a2e47cdb1cc60ccdea8c34d https://doi.org/10.1038/s41598-026-56762-5