Diffusion models have shown strong promise for image super-resolution (ISR). However, current approaches often underuse pretrained diffusion backbones and lack constraints on the sampling trajectory, which degrades structural consistency and fine details. To address this, we introduce the trajectory-consistent diffusion model (TCDM) for super-resolution, which jointly optimizes the sampling process through lightweight components and inference-time strategies while keeping the diffusion backbone frozen, yielding high-fidelity, detail-rich reconstructions. First, we propose a dynamic semantic selection (DSS) mechanism that records early intermediate states, matches them to upsampled low-resolution features, and reconditions sampling on the best match to reduce the mismatch between conditioning and noise scale. Next, we design a cross-step aggregation guidance (CAG) strategy that aggregates features from the current state and the selected intermediate state to enforce trajectory-level consistency in noise prediction. Finally, we present a plug-and-play frequency enhancement adapter (FE-Adapter) that injects different frequency-domain cues into the encoder during training, strengthening high-frequency perception while preserving global structures. Extensive experiments on multiple ISR benchmarks show that TCDM achieves strong structural fidelity and competitive no-reference perceptual quality, offering a favorable fidelity-perception trade-off.
Huang et al. (Thu,) studied this question.