Diffusion probabilistic models have shown strong performance in single-image super-resolution (SISR). Yet their multi-step denoising mechanism incurs prohibitive computational overhead, which severely limits real-world deployment. To address this issue, we propose an Entropy Subtraction-Supported Diffusion Denoising framework for image Reconstruction (ESRDF). The core idea is to shift part of the super-resolution (SR) burden from the diffusion model to an image decoder, with a key focus on recovering the symmetric structural correspondence between low-resolution (LR) and high-resolution (HR) images that is often degraded during downsampling. Specifically, ESRDF's main branch employs a CNN that performs one-step feature reconstruction, supervised by a novel entropy-matching loss in addition to the conventional reconstruction loss. This loss adopts a patch-wise entropy matching strategy that enforces regional consistency between the ground-truth and predicted images. Whereas the L1 loss focuses on pixel-level details and the perceptual loss captures global semantics, the region-wise entropy measurement complements both by aligning the information structure within each region across the whole image. Under this framework, the main branch delivers the coarse low-frequency content, drastically reducing the workload of the diffusion branch, which now only needs to sparsely refine high-frequency details. Experimental results on multiple benchmark datasets demonstrate that ESRDF converges faster and achieves higher generation quality with fewer denoising steps, outperforming previous diffusion-based image reconstruction methods.
Huang et al. (Tue,) studied this question.
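The patch-wise entropy matching idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the function names `patch_entropy` and `entropy_matching_loss`, the use of non-overlapping patches, a hard intensity histogram with `bins` levels, Shannon entropy in bits, and a mean absolute difference between the two entropy maps. A hard histogram is not differentiable, so a trainable version would presumably need a soft (kernel-based) histogram instead.

```python
import numpy as np


def patch_entropy(img, patch=8, bins=16):
    """Shannon entropy (in bits) of each non-overlapping patch.

    Assumes `img` is a 2-D grayscale array with values in [0, 1].
    Returns an array of shape (H // patch, W // patch).
    """
    h, w = img.shape
    h, w = h - h % patch, w - w % patch  # crop so the image tiles evenly
    tiles = (
        img[:h, :w]
        .reshape(h // patch, patch, w // patch, patch)
        .swapaxes(1, 2)
        .reshape(h // patch, w // patch, -1)
    )
    # Quantize intensities into `bins` discrete levels (hard histogram).
    idx = np.clip((tiles * bins).astype(int), 0, bins - 1)
    ent = np.zeros(idx.shape[:2])
    for b in range(bins):
        p = (idx == b).mean(axis=-1)  # per-patch probability of level b
        nz = p > 0                    # skip empty levels, since 0 * log 0 := 0
        ent[nz] -= p[nz] * np.log2(p[nz])
    return ent


def entropy_matching_loss(pred, target, patch=8, bins=16):
    """Mean absolute difference between the two patch-entropy maps (illustrative)."""
    diff = patch_entropy(pred, patch, bins) - patch_entropy(target, patch, bins)
    return float(np.abs(diff).mean())
```

In this sketch, a flat patch has zero entropy and a patch split evenly between two intensity levels has an entropy of one bit, so the loss penalizes a prediction whose regions are smoother or noisier than the corresponding regions of the reference image, independently of where exactly the pixels differ.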