May 16, 2026 | Open Access

Short-Term Precipitation Forecast Based on Diffusion Spatiotemporal Network

Key Points

Aims to improve short-term precipitation forecasting with a predictive network that combines spatiotemporal modeling and diffusion refinement.
Proposes a ViT-modulated diffusion spatiotemporal prediction network (VSTPN) that cascades a spatiotemporal prediction module with a ViT-conditioned diffusion refinement module.
Evaluates the model on the HKO-7 benchmark against existing baselines.
Reports six metrics: MSE, SSIM, CSI, HSS, POD, and FAR.
VSTPN achieves lower mean squared error (MSE) and higher structural similarity index (SSIM) than the tested baselines.
At the 40 dBZ threshold, VSTPN improves the critical success index (CSI), Heidke skill score (HSS), and probability of detection (POD).
At the same threshold, its false alarm ratio (FAR) is slightly higher than that of ETCJ-PredNet, indicating a trade-off between recall and false alarms for intense echoes.

Abstract

Short-term precipitation forecasting is essential for disaster prevention, urban management, and weather-sensitive decision making, yet radar-based nowcasting remains challenging because precipitation systems evolve nonlinearly and high-frequency echo structures are easily over-smoothed by deterministic sequence models. This paper proposes a ViT-modulated diffusion spatiotemporal prediction network (VSTPN) that cascades a spatiotemporal prediction module with a ViT-conditioned diffusion refinement module. The spatiotemporal module models the temporal evolution of radar echoes, whereas the ViT-Diffusion module uses global contextual features as conditional guidance during iterative denoising to refine spatial structures. Experiments on the HKO-7 benchmark show that VSTPN achieves lower MSE and higher SSIM than the tested baselines and improves CSI, HSS, and POD at the evaluated reflectivity thresholds. At the 40 dBZ threshold, the model improves CSI, HSS, and POD, while its FAR is slightly higher than that of ETCJ-PredNet, indicating a recall–false alarm trade-off for intense echoes. Additional post-hoc diagnostic analyses of relative gains, metric consistency, threshold sensitivity, and component effect sizes further support the stability of the reported improvements under the current experimental protocol. The results suggest that coupling spatiotemporal sequence modeling with diffusion-based radar echo refinement is a feasible direction for short-term precipitation forecasting; nevertheless, probabilistic uncertainty evaluation, multi-domain validation, and additional generative-quality metrics remain important directions for future work.
