Diffusion models have become the core generative paradigm across image, video, audio, and text synthesis. However, their multi-step iterative sampling makes inference slow, which limits real-time and large-scale deployment. This survey systematically reviews acceleration techniques for diffusion model sampling. It first introduces the unified theoretical framework of stochastic differential equations (SDEs) and probability flow ordinary differential equations (ODEs). It then analyses three key acceleration approaches: deterministic ODE solvers with schedule optimization, knowledge distillation and consistency models, and training-free methods with hardware co-design. It also discusses the architectural evolution from U-Net to Transformer, along with emerging paradigms such as flow matching and rectified flow. Finally, it summarizes acceleration practices in advanced multimodal applications and outlines future research directions. As application scenarios expand towards stricter real-time requirements and higher fidelity, sampling efficiency has become a critical bottleneck for the practical deployment of diffusion models. By delineating the theoretical underpinnings and technical landscape, this review aims to provide a structured reference and forward-looking insights for research in efficient generative modeling, thereby advancing generative AI towards more practical and controllable frontiers.
Yuhan Hei studied this question.