What question did this study set out to answer?

The aim is to accelerate the sampling process in flow-based image generation models while maintaining or enhancing quality.

February 19, 2026

FlowTurbo: Accelerating Flow-based Image Generation Models via Multi-stage Refinement

Key Points

The aim is to accelerate the sampling process in flow-based image generation models while maintaining or enhancing quality.
Proposed the FlowTurbo framework for improved sampling speed and quality in flow-based models.
Developed techniques like a pseudo corrector and sample-aware compilation to lower inference time.
Implemented a multi-stage refinement process to partition generation across different resolutions.
Achieved acceleration ratios of 53.1% to 58.3% in class-conditional generation and 29.8% to 38.5% in text-to-image generation.
FlowTurbo attained a Fréchet Inception Distance (FID) of 2.12 on ImageNet with 100 ms/image and 3.93 with 38 ms/image, setting a new state-of-the-art.
Paired with SD 3.5 Large, FlowTurbo reached an FID of 28.05 with a speed improvement of around 50% on an NVIDIA 3090 GPU.

Abstract

Building on the success of diffusion models in visual generation, flow-based models reemerge as another prominent family of generative models that have achieved competitive or better performance in terms of both visual quality and inference speed. By learning the velocity field through flow-matching, flow-based models tend to produce a straighter sampling trajectory, which is advantageous during the sampling process. However, unlike diffusion models, for which fast samplers are well-developed, efficient sampling of flow-based generative models has rarely been explored. In this paper, we propose a framework called FlowTurbo to accelerate the sampling of flow-based models while still enhancing the sampling quality. Our primary observation is that the velocity predictor's outputs in flow-based models become stable during sampling, enabling the estimation of velocity via a lightweight velocity refiner. Additionally, we introduce several techniques, including a pseudo corrector and sample-aware compilation, to further reduce inference time. Since FlowTurbo does not change the multi-step sampling paradigm, it can be effectively applied to various tasks such as image editing and inpainting. In addition, we propose a new multi-stage refinement technique designed to reduce inference costs for large flow-based image generation models. Specifically, the multi-stage refinement splits the whole generation procedure across different resolutions, forming a coarse-to-fine text-to-image pipeline. We further adopt a stage-aware deployment strategy that can maximize the inference speed in terms of both latency and throughput. By integrating FlowTurbo into different flow-based models, we obtain an acceleration ratio of 53.1% to 58.3% on class-conditional generation and 29.8% to 38.5% on text-to-image generation. Notably, FlowTurbo reaches an FID of 2.12 on ImageNet at 100 ms/img and an FID of 3.93 at 38 ms/img, achieving real-time image generation and establishing a new state of the art. Equipped with the recent SD 3.5 Large, we achieved an FID of 28.05 with a speed improvement of around 50% on an NVIDIA 3090 GPU. Code is available at https://github.com/shiml20/FlowTurbo.
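
The central observation in the abstract, that velocity predictions stabilize during sampling so later steps can rely on a cheaper estimator, can be illustrated with a toy sketch. The Python below is not the authors' implementation. The names `heavy_velocity`, `light_refiner`, and `sample` are hypothetical, the closed-form velocity field is an assumed stand-in for a trained predictor, and the "refiner" here simply reuses the previous velocity instead of running a learned network. It only shows how a multi-step Euler sampler might hand off from an expensive predictor to a lightweight one.

```python
def heavy_velocity(x, t, target):
    # Toy stand-in for the full velocity predictor (assumption): on a straight
    # path toward `target`, the velocity at time t is (target - x) / (1 - t).
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def light_refiner(x, t, v_prev):
    # Toy stand-in for a lightweight velocity refiner (assumption): since the
    # velocity is stable along a straight trajectory, reuse the previous value.
    return list(v_prev)


def sample(x0, target, n_heavy=2, n_light=6):
    # Euler sampling from t=0 to t=1. The first n_heavy steps call the
    # expensive predictor; the remaining n_light steps call the cheap refiner.
    if n_heavy < 1:
        raise ValueError("need at least one heavy step to initialize the velocity")
    n_steps = n_heavy + n_light
    dt = 1.0 / n_steps
    x, v, heavy_calls = list(x0), None, 0
    for i in range(n_steps):
        t = i * dt
        if i < n_heavy:
            v = heavy_velocity(x, t, target)
            heavy_calls += 1
        else:
            v = light_refiner(x, t, v)
        # Euler update: move along the current velocity estimate.
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x, heavy_calls


if __name__ == "__main__":
    x, calls = sample([1.0, -2.0], [3.0, 0.5])
    print("final sample:", x, "heavy predictor calls:", calls)
```

In this toy setting the trajectory is exactly straight, so handing off to the cheap step loses nothing. With a real model the trajectory is only approximately straight, which is presumably why the paper uses a learned refiner rather than plain reuse of the previous velocity.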
