Diffusion models produce high-fidelity images, yet adapting large pre-trained models to specialized domains with limited data remains challenging. We address this problem for microstructure generation by adapting a latent diffusion transformer (DiT) with a parameter-efficient fine-tuning (PEFT) strategy that updates only class embeddings, bias terms, layer normalization, and attention/feed-forward weights while freezing all remaining parameters. Compared with full fine-tuning (FFT), PEFT reduces training time and storage requirements by 16% and 17%, respectively, while achieving comparable or better image quality and diversity. We also introduce a structure-preserving overlap-crop (OC) augmentation that maintains spatial continuity across patch boundaries, reducing Fréchet Inception Distance (FID) by 26% over the conventional random-crop baseline. On the primary Ultrahigh Carbon Steel (UHCS) dataset, the generated micrographs achieve the lowest macro-average errors on domain-specific statistical descriptors (two-point correlation and lineal-path functions), confirming high morphological fidelity. The combined PEFT + OC pipeline is further validated on two additional datasets of increasing domain distance, Aachen-Heerlen (near-domain) and NFFA-Europe (far-domain), using a fixed training protocol without dataset-specific hyperparameter tuning. These results demonstrate that a structure-preserving augmentation, combined with parameter-efficient adaptation of a pre-trained DiT, provides a practical and transferable approach for synthetic microstructure generation under data-scarce conditions.
Phan et al. (Fri,) studied this question.
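The two domain-specific descriptors named in the abstract, the two-point correlation and the lineal-path function, can be illustrated with a minimal sketch. It assumes a binarized micrograph stored as a list of rows of 0/1 values, samples along the horizontal direction only, and uses no periodic boundaries. The function names and the simple mean absolute error are illustrative assumptions, not the authors' implementation or their macro-average metric.

```python
def two_point_correlation(img, r):
    """Estimate S2(r) along rows: the probability that two pixels
    separated by r columns both belong to the phase of interest (value 1)."""
    hits = 0
    pairs = 0
    for row in img:
        for j in range(len(row) - r):
            pairs += 1
            if row[j] == 1 and row[j + r] == 1:
                hits += 1
    return hits / pairs if pairs else 0.0


def lineal_path(img, r):
    """Estimate L(r) along rows: the probability that a whole horizontal
    segment spanning r + 1 pixels lies entirely inside the phase."""
    hits = 0
    segments = 0
    for row in img:
        for j in range(len(row) - r):
            segments += 1
            if all(v == 1 for v in row[j:j + r + 1]):
                hits += 1
    return hits / segments if segments else 0.0


def mean_abs_error(real, generated, descriptor, max_r):
    """Hypothetical comparison score: mean absolute difference of a
    descriptor between two images over distances 0..max_r."""
    diffs = [abs(descriptor(real, r) - descriptor(generated, r))
             for r in range(max_r + 1)]
    return sum(diffs) / len(diffs)


# Toy 2x4 binary "micrograph" used only for illustration.
toy = [[1, 1, 0, 1],
       [0, 1, 1, 1]]
s2_at_0 = two_point_correlation(toy, 0)  # equals the phase volume fraction
s2_at_2 = two_point_correlation(toy, 2)
lp_at_2 = lineal_path(toy, 2)
```

At r = 0 both descriptors reduce to the phase volume fraction, and L(r) never exceeds S2(r) because a fully in-phase segment implies in-phase endpoints. A generated image whose curves track those of the real micrographs over a range of r is, in this sense, morphologically similar.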