Data augmentation is a cornerstone technique for improving the generalization of deep learning classifiers, yet classical transformations such as random flipping, cropping, and color jittering offer limited semantic diversity. The emergence of latent diffusion models, and Stable Diffusion in particular, introduces a paradigm shift: high-fidelity, semantically rich synthetic images can now be generated on demand for arbitrary target categories. This survey provides a comprehensive review of Stable Diffusion-based data augmentation strategies and their impact on image classification performance across diverse domains, including natural imagery, medical imaging, and fine-grained recognition. We analyze more than thirty recent works spanning benchmark datasets such as CIFAR-10/100, ImageNet, ChestX-ray14, and ISIC, and evaluate augmentation outcomes across classifier architectures including ResNet, EfficientNet, and Vision Transformers. Our synthesis reveals that Stable Diffusion-based augmentation consistently outperforms traditional geometric and GAN-based methods, yielding accuracy gains of 2–5% on standard benchmarks and up to 7% on data-scarce medical datasets. We further identify the critical role of prompt engineering, fine-tuning and conditioning strategies such as DreamBooth and ControlNet, and hybrid augmentation pipelines in maximizing classifier performance. We conclude with an analysis of current challenges and future research directions.
Goel et al. (Thu,) studied this question.