ProFreqDiff technical report / preprint. Diffusion models for image synthesis often exhibit coarse-to-fine denoising behavior: large-scale structure is typically recovered more robustly under high noise, while fine texture and edges emerge at lower noise levels [ho2020, nichol2021, karras2022]. This behavior is especially important in high-resolution synthesis, where perceptual quality depends strongly on high-frequency fidelity. Standard diffusion training, however, is largely frequency-agnostic: the forward corruption process is global, timestep sampling is usually content-independent, and supervision is commonly applied only at the final output. This paper presents a unified training framework for progressive frequency-aware diffusion in latent space. The framework combines four components: (1) a spectral curriculum loss that shifts auxiliary reconstruction emphasis from lower to higher frequency bands over the course of training, (2) an adaptive per-sample log-SNR shift based on a fixed input-spectrum difficulty score, (3) lightweight multi-scale latent frequency heads used only during training, and (4) shared-weight progressive-resolution training. The design goal is to align optimization more closely with coarse-to-fine spectral learning while preserving a standard inference-time sampling pipeline. To make the paper self-contained and empirically grounded, we include controlled simulation experiments that isolate each of the proposed mechanisms. These simulations do not replace full large-scale benchmarks, but they provide precise, controlled quantitative evidence for the method's training dynamics and expected behavior. In the simulated setting, the spectral curriculum improves late-band (high-frequency) reconstruction, adaptive log-SNR shifting reduces error for spectrally difficult samples, and the combined method yields the best overall reconstruction and spectral balance. Existing OSF archival DOI: 10.17605/OSF.IO/2CVGQ; existing OSF archival page: https://osf.io/2cvgq/.
Files include the technical report PDF and the LaTeX source tarball when available.
Haopeng Jin studied this question.
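
To make components (1) and (2) of the abstract more concrete, the following is a minimal NumPy sketch of how a spectral curriculum loss and a per-sample log-SNR shift could look. It is an illustration under stated assumptions, not the report's implementation: the function names, the concentric-ring band partition, the softmax-style weight schedule, the `cutoff`, `sharpness`, and `max_shift` parameters, and the direction of the shift (harder samples receive a higher log-SNR) are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

def normalized_radius(h, w):
    """Radial frequency of every 2D FFT bin, scaled to [0, 1] (requires h, w > 1)."""
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    r = np.sqrt(fy ** 2 + fx ** 2)
    return r / r.max()

def curriculum_weights(progress, n_bands, sharpness=4.0):
    """Band weights whose peak moves from band 0 to band n_bands-1 as progress goes 0 -> 1.

    Assumption: a softmax over squared distance to a moving centre; the report
    may use a different schedule.
    """
    centre = progress * (n_bands - 1)
    logits = -sharpness * (np.arange(n_bands) - centre) ** 2 / n_bands
    w = np.exp(logits - logits.max())
    return w / w.sum()

def spectral_curriculum_loss(pred, target, progress, n_bands=4):
    """Weighted sum of per-band mean squared spectral error for (B, C, H, W) arrays."""
    h, w = pred.shape[-2:]
    r = normalized_radius(h, w)
    band_idx = np.minimum((r * n_bands).astype(int), n_bands - 1)
    err = np.abs(np.fft.fft2(pred - target, norm="ortho")) ** 2
    # Boolean (H, W) masks select the bins of each concentric frequency ring.
    band_err = np.array([err[..., band_idx == b].mean() for b in range(n_bands)])
    return float((curriculum_weights(progress, n_bands) * band_err).sum())

def spectral_difficulty(x, cutoff=0.5):
    """Per-sample fraction of spectral energy above a normalized radial cutoff.

    Assumption: 'difficulty' is taken to mean high-frequency energy share;
    it depends only on the input, so it can be computed once per sample.
    """
    h, w = x.shape[-2:]
    high_mask = normalized_radius(h, w) >= cutoff
    power = np.abs(np.fft.fft2(x, norm="ortho")) ** 2
    high = power[..., high_mask].sum(axis=(1, 2))
    total = power.sum(axis=(1, 2, 3)) + 1e-12
    return high / total

def shifted_log_snr(log_snr, difficulty, max_shift=1.0):
    """Shift each sample's log-SNR by at most +/- max_shift.

    Assumption: difficulty 1 shifts toward higher log-SNR (less noise),
    difficulty 0 toward lower log-SNR.
    """
    return log_snr + max_shift * (2.0 * difficulty - 1.0)
```

In a training loop under these assumptions, `progress` would be the fraction of training completed, `spectral_curriculum_loss` would be added as an auxiliary term to the usual denoising objective, and `shifted_log_snr` would be applied to the sampled noise level before corrupting each latent. Because both pieces act only at training time, the inference-time sampler is left unchanged, which matches the design goal stated in the abstract.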