What type of study is this?

This is an experimental study.

September 30, 2025 · Open Access

InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis

Tao Han (Hong Kong University of Science and Technology), Wanghan Xu (Beijing Academy of Artificial Intelligence), Junchao Gong (Shanghai Jiao Tong University)

Key Points

InfGen cuts 4K image generation time to under 10 seconds, down from over 100 seconds with current diffusion models.
Experiments show that InfGen extends a range of image generation models to arbitrary resolutions.
The proposed generator simplifies the process by removing the need to retrain the diffusion model.
By decoding from a fixed-size latent representation, InfGen provides a scalable approach to high-resolution image synthesis.

Abstract

Arbitrary-resolution image generation provides a consistent visual experience across devices and has extensive applications for producers and consumers. In current diffusion models, computational demand grows quadratically with resolution, so generating a 4K image takes over 100 seconds. To address this, we explore a second generation stage on top of latent diffusion models: the fixed-size latent produced by the diffusion model is treated as the content representation, and a one-step generator decodes images of arbitrary resolution from this compact latent. We therefore present InfGen, which replaces the VAE decoder with the new generator and produces images at any resolution from a fixed-size latent without retraining the diffusion model. This simplifies the process, reduces computational complexity, and applies to any model that uses the same latent space. Experiments show that InfGen brings many models into the arbitrary high-resolution era while cutting 4K image generation time to under 10 seconds.
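
To make the scaling argument concrete, the following is a minimal sketch, not taken from the paper, that counts how many latent positions a diffusion model would have to denoise if its latent grew with the output size. The 8x spatial downsample factor and the 512x512 base resolution are illustrative assumptions, and the function names are hypothetical. In the setup the abstract describes, the latent stays at a fixed size, so the diffusion workload would stay at the base level and only the one-step generator would operate at the target resolution.

```python
# Illustrative arithmetic only; the downsample factor and base resolution
# below are assumptions, not values reported in the paper.

ASSUMED_DOWNSAMPLE = 8      # assumed VAE spatial downsample factor
ASSUMED_BASE = (512, 512)   # assumed base (height, width) of the diffusion model


def latent_shape(height, width, downsample=ASSUMED_DOWNSAMPLE):
    """Latent grid (rows, cols) for an image of the given pixel size."""
    return height // downsample, width // downsample


def latent_positions(height, width, downsample=ASSUMED_DOWNSAMPLE):
    """Number of spatial positions in the latent grid."""
    rows, cols = latent_shape(height, width, downsample)
    return rows * cols


def relative_positions(height, width, base=ASSUMED_BASE):
    """Latent positions at the target size, relative to the base size."""
    return latent_positions(height, width) / latent_positions(*base)


if __name__ == "__main__":
    targets = {
        "512x512": (512, 512),
        "1024x1024": (1024, 1024),
        "3840x2160 (4K)": (2160, 3840),
    }
    for name, (h, w) in targets.items():
        print(f"{name}: latent grid {latent_shape(h, w)}, "
              f"{relative_positions(h, w):.1f}x the base latent positions")
```

Doubling the side length quadruples the number of latent positions, which is the quadratic growth the abstract refers to; any per-position interaction cost inside the diffusion model (for example attention) would grow faster still.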
