Machine learning approaches to wildfire spread prediction are constrained by the lack of standardized, multi-source, spatiotemporal datasets that fuse terrain, weather, and fire-state information into a single ML-ready format. We present WildfireCube, a reproducible event-centric pipeline and methodology for constructing dense fourth-order spatiotemporal tensors of shape (T, C, H, W) at 30 m spatial and 3 h temporal resolution. Following the analysis-ready data convention established in the Earth Observation community, the pipeline fuses four open data sources: the Copernicus GLO-30 Digital Elevation Model for static terrain derivatives, ERA5-Land reanalysis for hourly weather forcing, Sentinel-2 Level-2A imagery for spectral vegetation and burn-severity indices, and NASA FIRMS active-fire hotspot detections for fire-state reconstruction via ordinary kriging. The resulting 13-channel normalized tensor separates causal drivers into three physically motivated groups: static landscape controls (elevation, slope, aspect, fuel load), dynamic atmospheric forcings (wind components, temperature, precipitation), and evolving fire state (fire-front mask, burn severity, fractional burn, observation confidence). A physics-informed normalization framework maps all channels to bounded ranges using fixed physical constants rather than sample statistics, ensuring cross-event comparability and exact invertibility. We demonstrate the pipeline on 13 wildfire events across the United States, Canada, and Greece (2017–2023), producing a processed catalog that exceeds 300 GB compressed and spans a 14-fold range in burned area, a 27 °C range in mean temperature, and several distinct fire regimes. Event tensors are stored in chunked Zarr archives with Zstandard compression, achieving a 2.58× compression ratio.
As future work, the pipeline will be applied to a 40-event target catalog projected to exceed 2 TB of raw data, providing the multi-regime diversity and scale required for training robust deep learning models for spatiotemporal wildfire prediction.
Linardos et al. (Fri,) studied this question.
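
The fixed-constant normalization described in the abstract can be illustrated with a minimal sketch. The channel names and physical bounds below are hypothetical placeholders chosen for illustration, not the constants used by WildfireCube; the point is only that an affine map with fixed bounds is independent of the sample and can be inverted without storing per-event statistics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelRange:
    """Fixed physical bounds for one channel (illustrative values only)."""
    lo: float
    hi: float

    def normalize(self, x: float) -> float:
        # Affine map [lo, hi] -> [0, 1]; no clipping, so the map stays invertible.
        return (x - self.lo) / (self.hi - self.lo)

    def denormalize(self, z: float) -> float:
        # Inverse affine map [0, 1] -> [lo, hi].
        return z * (self.hi - self.lo) + self.lo


# Hypothetical bounds, NOT taken from the paper.
RANGES = {
    "elevation_m": ChannelRange(-500.0, 9000.0),
    "slope_deg": ChannelRange(0.0, 90.0),
    "temperature_c": ChannelRange(-50.0, 60.0),
}


def normalize_pixel(pixel: dict[str, float]) -> dict[str, float]:
    """Normalize each channel of one pixel with its fixed bounds."""
    return {name: RANGES[name].normalize(value) for name, value in pixel.items()}


def denormalize_pixel(pixel: dict[str, float]) -> dict[str, float]:
    """Recover physical units from normalized channel values."""
    return {name: RANGES[name].denormalize(value) for name, value in pixel.items()}
```

Because the bounds do not depend on any particular event, the same physical value maps to the same normalized value in every event tensor. In floating-point arithmetic the round trip is exact only up to rounding error.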