April 17, 2026

Latent Diffusion‐GAN: Adversarial Learning in the Autoencoded Latent Space

Key Points

The study aims to improve perceptual quality and structural detail in diffusion-based image generation by adding adversarial learning.
It proposes Latent Diffusion Generative Adversarial Networks (LD-GAN), which integrate adversarial learning into diffusion models without modifying their original pipeline.
The pretrained variational autoencoder (VAE) of the latent diffusion model serves as an energy-based discriminator, so adversarial training adds no parameters.
A structural consistency energy aligns encoder and decoder feature representations.
The authors report that LD-GAN significantly improves sample fidelity and perceptual sharpness over state-of-the-art baseline methods.
They also report improved sample diversity across various generation tasks.
Training dynamics are reported to remain efficient.

Abstract

Diffusion models are powerful generative frameworks for producing high‐quality images by denoising latent variables from random noise. However, training with likelihood‐based objectives, such as denoising score matching, can lead to locally oversmoothed high‐frequency details, including fine textures and sharp edges, thereby limiting perceptual fidelity and structural detail. Adversarial training with GANs enhances sharpness but typically requires additional discriminator networks, increasing computational costs and destabilizing training. To this end, we propose Latent Diffusion Generative Adversarial Networks (LD‐GAN), a novel framework that seamlessly integrates adversarial learning into diffusion models without modifying their original pipeline. LD‐GAN leverages the pretrained variational autoencoder (VAE) in latent diffusion models as an energy‐based discriminator, enabling adversarial training without extra parameters and preserving the structured latent priors learned from large datasets. We also introduce a structural consistency energy that aligns encoder and decoder feature representations, thereby enhancing perceptual quality and compatibility with the pretrained latent space. Extensive experiments demonstrate that LD‐GAN significantly improves sample fidelity, perceptual sharpness, and diversity over state‐of‐the‐art baseline methods across various generation tasks while ensuring efficient training dynamics.
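
The abstract does not give the exact form of the energy, so the following is only a minimal sketch of one common reading of "autoencoder as energy-based discriminator": the EBGAN-style formulation, in which reconstruction error is the energy and a margin loss separates real from generated samples. The toy linear `encode`/`decode` functions, the `margin` value, and all function names are hypothetical stand-ins for a pretrained VAE and are not taken from the paper. Whether LD-GAN uses this particular loss, and whether it updates the VAE, is not stated in the abstract. The structural consistency energy is not sketched here.

```python
# Toy sketch: a fixed ("pretrained") autoencoder reused as an energy function.
# Everything here is a hypothetical stand-in, not the paper's implementation.

def encode(x):
    # Toy encoder: average adjacent pairs (4-D input -> 2-D latent).
    return [(x[0] + x[1]) / 2.0, (x[2] + x[3]) / 2.0]

def decode(z):
    # Toy decoder: repeat each latent value (2-D latent -> 4-D output).
    return [z[0], z[0], z[1], z[1]]

def energy(x):
    # Mean squared reconstruction error used as the energy: low for samples
    # the autoencoder reproduces well, high otherwise (EBGAN-style assumption).
    recon = decode(encode(x))
    return sum((r - v) ** 2 for r, v in zip(recon, x)) / len(x)

def discriminator_loss(real, fake, margin=1.0):
    # EBGAN margin loss: lower the energy of real samples and raise the
    # energy of generated samples until it reaches the margin.
    return energy(real) + max(0.0, margin - energy(fake))

def generator_loss(fake):
    # The generator is rewarded for producing low-energy samples.
    return energy(fake)

real = [1.0, 1.0, 3.0, 3.0]  # reconstructed exactly by the toy autoencoder
fake = [0.0, 2.0, 0.0, 4.0]  # not reconstructed exactly

print(energy(real), energy(fake))      # 0.0 2.5
print(discriminator_loss(real, fake))  # 0.0
print(generator_loss(fake))            # 2.5
```

In this toy setting the "real" sample has zero energy because the autoencoder reproduces it exactly, while the "fake" sample is penalized in proportion to how far it lies from what the autoencoder can reconstruct. This illustrates why reusing an existing autoencoder can supply an adversarial signal without a separate discriminator network.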

Authors

U-Chae Jun

Sookmyung Women's University

Jaeeun Ko

Jiwoo Kang

Sookmyung Women's University

Journals

Computer Graphics Forum

Institutions

Sookmyung Women's University
