What question did this study set out to answer?

The aim is to improve image generation quality in denoising diffusion models under class-imbalanced training data.

May 9, 2026 | Open Access

PD-CBDM: Training Class-Balancing Diffusion Models with Perceptual Distinguish Loss

Key Points

Introduced PD-CBDM to adjust target-label distribution for better sampling of tail classes.
Implemented perceptual distinguish loss to increase Kullback-Leibler divergence between head and tail classes.
Designed a timestep-dependent Self-Attention module to improve noise estimation during image generation.
FID improved from 5.81 to 4.96 on CIFAR100-LT, indicating better image quality.
FID improved from 5.46 to 5.03 on CIFAR10-LT, showing effectiveness across datasets.
PD-CBDM's performance is competitive with recent methods BPA and NoisyTwins.
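For a sense of scale, the two reported FID changes can be expressed as relative reductions (lower FID is better). The short sketch below only redoes that arithmetic on the figures quoted above; the helper name `relative_reduction` is an illustrative assumption, not something from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a lower-is-better score dropped."""
    return (before - after) / before


# FID values as quoted in the key points: (starting value, PD-CBDM value).
results = {
    "CIFAR100-LT": (5.81, 4.96),
    "CIFAR10-LT": (5.46, 5.03),
}

reductions = {
    name: relative_reduction(before, after)
    for name, (before, after) in results.items()
}

for name, value in reductions.items():
    print(f"{name}: {value:.1%} lower FID")
# CIFAR100-LT: 14.6% lower FID
# CIFAR10-LT: 7.9% lower FID
```

The gain is roughly twice as large on CIFAR100-LT as on CIFAR10-LT in relative terms.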

Abstract

For image generation, denoising diffusion probabilistic models (DDPMs) have shown strong performance. Nevertheless, under class-imbalanced training data, many existing models tend to overfit head classes, which degrades image quality for tail classes. To mitigate this issue, we propose a new generation method, PD-CBDM (perceptual distinguish loss–class-balancing diffusion models). As a first step, PD-CBDM revises the target-label distribution used for label sampling in the baseline pipeline, so tail classes are sampled more frequently during training; this improves the diversity of generated images while keeping fidelity high. Next, we introduce a perceptual distinguish loss that enlarges the separation (measured by the KL divergence in the reverse process) between the data distributions of head and tail classes, which helps suppress head-class overfitting and improves generation quality across classes. Additionally, we propose a timestep-dependent Self-Attention (TSA) module that injects timestep cues into the self-attention mechanism to model temporal and spatial dependencies together, thereby enhancing noise estimation accuracy and image generation quality. Experiments show that PD-CBDM improves FID from 5.81 to 4.96 on CIFAR100-LT and from 5.46 to 5.03 on CIFAR10-LT, and it is competitive with representative recent methods such as BPA and NoisyTwins.
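The first step, revising the target-label distribution so tail classes are drawn more often, can be illustrated with a minimal sketch. The abstract does not give the exact distribution PD-CBDM uses, so the version below assumes a simple blend of the empirical class frequencies with a uniform distribution. The function names, the `tau` parameter, and the class counts are all hypothetical.

```python
import random
from collections import Counter


def balanced_label_probs(class_counts, tau=0.5):
    """Blend empirical class frequencies with a uniform distribution.

    tau = 0 keeps the long-tailed empirical distribution,
    tau = 1 gives a uniform distribution over classes.
    This blending rule is an assumption, not the paper's formula.
    """
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [(1 - tau) * c / total + tau / num_classes for c in class_counts]


def sample_labels(probs, n, seed=0):
    """Draw n class labels according to probs."""
    rng = random.Random(seed)
    return rng.choices(range(len(probs)), weights=probs, k=n)


# Hypothetical long-tailed dataset: one head class, two tail classes.
counts = [5000, 500, 50]
empirical = balanced_label_probs(counts, tau=0.0)
adjusted = balanced_label_probs(counts, tau=0.5)

labels = sample_labels(adjusted, n=10_000)
print("empirical:", [round(p, 3) for p in empirical])
print("adjusted: ", [round(p, 3) for p in adjusted])
print("sampled:  ", Counter(labels))
```

With `tau=0.5` the rarest class is sampled far more often than its share of the training data, while the head class still dominates. How aggressively to rebalance is a design choice this sketch leaves open.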


Cite This Study

Hu et al. (2026) studied this question.

synapsesocial.com/papers/69fecfafb9154b0b82876ac9 https://doi.org/10.3390/math14101576
