Accurately forecasting near-term wildfire spread from aerial imagery is critical for emergency response. However, it remains challenging because fire growth is driven by complex factors such as wind, terrain, and fuel variability. Most prior work has focused on per-frame fire detection or mapping rather than true spread prediction, and has often ignored environmental dynamics. As a result, purely data-driven models struggle to generalize and can produce physically inconsistent spread forecasts under changing wind or terrain conditions. To address these issues, we propose FireCast-Fusion, a deep learning framework that fuses multimodal UAV imagery with environmental data for short-horizon wildfire forecasting. The model integrates a temporal transformer encoder (TTE), which captures the evolving fire-front dynamics from sequential UAV frames, with a physics-guided diffusion layer (PGDL), which incorporates wind direction and slope information to enforce realistic spread behavior. By combining visual cues with physical constraints, FireCast-Fusion produces both probabilistic fire-front maps and per-pixel arrival-time estimates for the advancing wildfire. The system is trained with a multi-objective loss that jointly optimizes fire segmentation accuracy, arrival-time prediction, and physical consistency, so that the model learns to balance predictive performance with adherence to real-world spread dynamics. Experimental results show that our approach outperforms conventional 3D CNN and LSTM-based models, improving fire-front intersection-over-union and arrival-time error by up to 18%. The hybrid transformer–diffusion design also yields more stable predictions under varying wind conditions and rugged topography, producing interpretable spread trajectories that align closely with observed fire behavior.

• A multimodal framework integrating UAV RGB–thermal imagery with environmental factors such as wind, terrain, and vegetation for wildfire spread prediction.
• A hybrid architecture combining a temporal transformer with a physics-guided diffusion layer to capture spatiotemporal dynamics and enforce physically consistent propagation.
• A dual-output design for simultaneous estimation of fire probability and per-pixel arrival time.
• A unified multi-objective training strategy balancing segmentation accuracy, temporal consistency, and physical constraints.
• Improved performance over baseline models across multiple UAV datasets, demonstrating robust generalization and stable prediction under varying conditions.
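
The abstract does not give the form of the multi-objective loss, so the following is only a minimal NumPy sketch of how three such terms might be combined. Everything in it is an assumption made for illustration: the function names, the weights `w_seg`, `w_time`, `w_phys`, and in particular the physics term, which here is an eikonal-style residual |∇T|·R − 1 on the predicted arrival time T with a hypothetical wind-dependent rate of spread R = `r0` · (1 + `k_wind` · tailwind). Slope is omitted for brevity. It should not be read as the formulation used in FireCast-Fusion.

```python
import numpy as np

def segmentation_loss(prob, burned, eps=1e-7):
    """Binary cross-entropy between predicted fire probability and burned mask."""
    p = np.clip(prob, eps, 1.0 - eps)
    return float(np.mean(-(burned * np.log(p) + (1.0 - burned) * np.log(1.0 - p))))

def arrival_time_loss(pred_t, true_t, burned):
    """Mean absolute arrival-time error over pixels that actually burned."""
    n = burned.sum()
    if n == 0:
        return 0.0
    return float(np.sum(np.abs(pred_t - true_t) * burned) / n)

def physics_residual(pred_t, wind_uv, r0=1.0, k_wind=0.5):
    """Mean squared eikonal residual (|grad T| * R - 1)^2.

    Assumed (hypothetical) spread model: R = r0 * (1 + k_wind * tailwind),
    where tailwind is the positive part of the wind component along the
    local spread direction grad T / |grad T|.
    """
    dt_dy, dt_dx = np.gradient(pred_t)  # axis 0 = rows (y), axis 1 = columns (x)
    grad_norm = np.sqrt(dt_dx ** 2 + dt_dy ** 2)
    safe = np.maximum(grad_norm, 1e-8)  # avoid division by zero on flat regions
    nx, ny = dt_dx / safe, dt_dy / safe
    tailwind = np.maximum(0.0, nx * wind_uv[0] + ny * wind_uv[1])
    rate = r0 * (1.0 + k_wind * tailwind)
    return float(np.mean((grad_norm * rate - 1.0) ** 2))

def multi_objective_loss(prob, burned, pred_t, true_t, wind_uv,
                         w_seg=1.0, w_time=1.0, w_phys=0.1):
    """Weighted sum of the three terms; the weights are illustrative only."""
    terms = {
        "seg": segmentation_loss(prob, burned),
        "time": arrival_time_loss(pred_t, true_t, burned),
        "phys": physics_residual(pred_t, wind_uv),
    }
    total = w_seg * terms["seg"] + w_time * terms["time"] + w_phys * terms["phys"]
    return total, terms

# Toy case: a planar front moving in +x at unit speed, with no wind.
cols = np.arange(8, dtype=float)
true_t = np.tile(cols, (8, 1))   # arrival time grows linearly with x
burned = np.ones((8, 8))
prob = np.full((8, 8), 0.9)
total, terms = multi_objective_loss(prob, burned, true_t, true_t, wind_uv=(0.0, 0.0))
```

In this toy case the arrival-time and physics terms vanish, because the predicted field matches the target and satisfies the assumed spread model exactly, so only the segmentation term contributes. Passing a tailwind such as `wind_uv=(1.0, 0.0)` makes the same field inconsistent with the assumed rate of spread and the physics term becomes positive.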
Abbas et al. (Wed,) studied this question.