This study introduces a hybrid machine learning framework for modeling multi-scale acoustic wavefields in randomly layered atmospheric media. The approach accounts for both the deterministic variability associated with large-scale ducting structures, as captured by atmospheric reanalyses (e.g., ERA5), and the stochastic, fine-scale fluctuations driven by gravity waves (GWs). To capture this complex variability, we combine Fourier Neural Operators (FNOs), which model the large-scale, flow-dependent response of the atmosphere, with diffusion models conditioned on FNO outputs to reconstruct fine-scale structures. The FNO effectively learns the dominant propagation regimes and reflects the extreme sensitivity to GWs that arises when the effective sound speed ratio approaches unity. Applied to a decade of infrasound recordings from stations located several hundred kilometers from controlled explosions at the Hukkakero military range in Finland, the hybrid framework significantly improves both the spectral and the temporal resolution of the predicted waveforms. A key contribution of this work lies in its modal interpretation: space–time decomposition reveals that the diffusion model systematically restores higher-order acoustic modes linked to GW–infrasound coupling, modes otherwise suppressed by the spectral bias of neural operators. This framework opens new directions for data-driven surrogate modeling of long-range infrasound propagation in turbulent and stochastically perturbed media.
Millet et al. studied this question.
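The suppression of fine-scale content described in the abstract can be illustrated with a minimal sketch of a single Fourier layer. This is not the authors' architecture. It assumes that truncating the retained Fourier modes, a common design choice in FNO layers, is one contributor to the loss of small scales. The function name `spectral_conv_1d`, the cutoff `n_modes`, the untrained identity weights, and the synthetic two-scale signal are all hypothetical choices made for illustration.

```python
import numpy as np

def spectral_conv_1d(u, weights, n_modes):
    """Multiply the lowest n_modes Fourier coefficients of a real 1D signal
    by per-mode weights and discard every higher mode (hypothetical helper)."""
    n = u.shape[-1]
    u_hat = np.fft.rfft(u)                  # one-sided spectrum, n // 2 + 1 bins
    out_hat = np.zeros_like(u_hat)          # modes >= n_modes stay at zero
    out_hat[:n_modes] = weights[:n_modes] * u_hat[:n_modes]
    return np.fft.irfft(out_hat, n=n)       # back to physical space

# Synthetic two-scale signal (assumption: stands in for a wavefield profile).
n = 256
x = np.linspace(0.0, 1.0, n, endpoint=False)
large_scale = np.sin(2 * np.pi * 3 * x)         # wavenumber 3, below the cutoff
fine_scale = 0.3 * np.sin(2 * np.pi * 40 * x)   # wavenumber 40, above the cutoff
u = large_scale + fine_scale

# Identity weights: an untrained layer, so only the truncation acts here.
n_modes = 16
weights = np.ones(n_modes, dtype=complex)
v = spectral_conv_1d(u, weights, n_modes)

# The layer output keeps the large-scale part; the residual is the fine-scale
# part that a second, conditional model would have to reconstruct.
residual = u - v
print("max |v - large_scale|:", np.max(np.abs(v - large_scale)))
print("max |residual - fine_scale|:", np.max(np.abs(residual - fine_scale)))
```

In this toy setting the layer output matches the large-scale component and the residual matches the fine-scale component, which mirrors the division of labor in the abstract: the operator supplies the large-scale response, and a generative model conditioned on it restores the missing higher-order content.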