Land surface temperature (LST) is essential for studying land–atmosphere energy exchange and the impacts of climate change, including its influence on crop yields and hydrology. Although satellite remote sensing provides large-scale LST data, existing spatiotemporal fusion methods have clear limitations: traditional algorithms struggle with heterogeneous surfaces, and deep-learning models often produce blurred details and inaccurate temperatures, which limits their use in high-precision applications. This study addresses these issues by developing a Deep-Learning Spatial and Temporal Fusion Model (DLSTFM) for Landsat-8 and MODIS LST imagery over Griffith, Australia. DLSTFM employs a dual-branch structure, with one branch dedicated to dual-temporal fusion and the other to multi-source feature fusion. Its key innovations are the Spatial Adaptive Feature Modulation (SAFM) module, which performs adaptive multi-scale feature fusion, and the Temperature Adaptive Correction Module (TCM), which makes pixel-wise adjustments using reference data. Experiments show that DLSTFM significantly outperforms both traditional methods and existing deep-learning fusion methods, producing clearer surface features and a mean absolute temperature error of approximately 2.1 K. The model also generalized well to a second test area (Ardlethan) without retraining, which indicates its practical value for high-accuracy LST fusion.
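For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of how a mean absolute temperature error in kelvin could be computed between a fused LST grid and a reference grid. The function name `mean_absolute_error_k`, the nested-list grid layout, the NaN convention for masked (for example, cloudy) pixels, and the sample values are all assumptions made for illustration; they are not taken from the study, and the authors' evaluation code may differ.

```python
import math


def mean_absolute_error_k(fused, reference):
    """Mean absolute error (K) between two equally shaped 2-D LST grids.

    Pixels that are NaN in either grid (assumed here to mark masked
    pixels, e.g. cloud) are skipped.
    """
    if len(fused) != len(reference):
        raise ValueError("grids must have the same number of rows")
    total = 0.0
    count = 0
    for fused_row, ref_row in zip(fused, reference):
        if len(fused_row) != len(ref_row):
            raise ValueError("grids must have the same number of columns")
        for f, r in zip(fused_row, ref_row):
            if math.isnan(f) or math.isnan(r):
                continue  # skip masked pixels
            total += abs(f - r)
            count += 1
    if count == 0:
        raise ValueError("no valid pixels to compare")
    return total / count


# Hypothetical 2x2 grids in kelvin; one fused pixel is masked.
fused = [[300.0, 302.5], [298.0, float("nan")]]
reference = [[301.0, 300.5], [299.5, 305.0]]
mae = mean_absolute_error_k(fused, reference)
print(f"MAE = {mae:.2f} K")
```

With these sample values, three pixels are valid and their absolute differences (1.0, 2.0, and 1.5 K) average to 1.5 K.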
Jin et al. (Mon,) studied this question.