What question did this study set out to answer?

The survey examines how Stable Diffusion-based data augmentation affects image classification performance across different domains.

May 9, 2026 · Open Access

Generative Augmentation for Image Classification: A Survey on Stable Diffusion-Based Approaches

Key Points

Reviewed over thirty recent works on data augmentation using Stable Diffusion.
Analyzed augmentation outcomes on datasets like CIFAR-10, ImageNet, ChestXray14, and ISIC.
Evaluated classifier architectures such as ResNet, EfficientNet, and Vision Transformers.
Stable Diffusion-based augmentation yields reported accuracy gains of 2–5% on standard benchmarks.
Reported gains reach up to 7% on data-scarce medical datasets.
Identified prompt engineering and fine-tuning strategies as critical for maximizing classifier performance.

Abstract

Data augmentation is a cornerstone technique for improving the generalization of deep learning classifiers, yet classical geometric transformations such as random flipping, cropping, and color jittering offer limited semantic diversity. The emergence of latent diffusion models, and Stable Diffusion in particular, introduces a paradigm shift: high-fidelity, semantically rich synthetic images can now be generated on demand for arbitrary target categories. This survey provides a comprehensive review of Stable Diffusion-based data augmentation strategies and their impact on image classification performance across diverse domains, including natural imagery, medical imaging, and fine-grained recognition. We analyze over thirty recent works spanning benchmark datasets such as CIFAR-10/100, ImageNet, ChestXray14, and ISIC, and evaluate augmentation outcomes across classifier architectures including ResNet, EfficientNet, and Vision Transformers. Our synthesis reveals that Stable Diffusion-based augmentation consistently outperforms traditional geometric and GAN-based methods, yielding accuracy gains of 2–5% on standard benchmarks and up to 7% on data-scarce medical datasets. We further identify the critical role of prompt engineering, fine-tuning strategies such as DreamBooth and ControlNet, and hybrid augmentation pipelines in maximizing classifier performance. We conclude with an analysis of current challenges and future research directions.
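
As an illustration of the prompt engineering and hybrid augmentation pipelines mentioned in the abstract, the following is a minimal sketch, not a method taken from any surveyed work. It assumes synthetic images have already been generated elsewhere (for example with a Stable Diffusion pipeline) and are represented only as (identifier, label) pairs. The function names `build_prompts` and `mix_real_and_synthetic`, the prompt templates, and the `synthetic_ratio` parameter are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[str, int]  # (image path or identifier, class label index)


def build_prompts(class_names: Sequence[str],
                  templates: Sequence[str]) -> List[Tuple[str, int]]:
    """Expand hypothetical prompt templates into (prompt, label index) pairs.

    Each template must contain a "{label}" placeholder.
    """
    prompts = []
    for label_idx, name in enumerate(class_names):
        for template in templates:
            prompts.append((template.format(label=name), label_idx))
    return prompts


def mix_real_and_synthetic(real: Sequence[Sample],
                           synthetic: Sequence[Sample],
                           synthetic_ratio: float,
                           seed: int = 0) -> List[Sample]:
    """Keep every real sample and add a random subset of synthetic ones.

    synthetic_ratio is an assumed knob: the number of synthetic samples
    added per real sample (0.5 means one synthetic per two real samples),
    capped by how many synthetic samples are available.
    """
    if synthetic_ratio < 0:
        raise ValueError("synthetic_ratio must be non-negative")
    n_synth = min(len(synthetic), int(synthetic_ratio * len(real)))
    rng = random.Random(seed)  # fixed seed keeps the mix reproducible
    chosen = rng.sample(list(synthetic), n_synth)
    mixed = list(real) + chosen
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # Hypothetical class names and template, for illustration only.
    for prompt, label in build_prompts(["cat", "dog"], ["a photo of a {label}"]):
        print(label, prompt)

    real = [(f"real_{i}.png", i % 2) for i in range(4)]
    synthetic = [(f"synth_{i}.png", i % 2) for i in range(10)]
    print(len(mix_real_and_synthetic(real, synthetic, synthetic_ratio=0.5)))
```

The sketch keeps all real samples and treats the synthetic share as a tunable quantity. The appropriate ratio is an empirical question that this example does not answer.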
