What question did this study set out to answer?

This study asks whether a new diffusion model with a constrained sampling trajectory can improve structural consistency and fine detail in image super-resolution.

June 13, 2026

Selection, Aggregation, and Enhancement: Trajectory Consistent Diffusion Model for Image Super-Resolution

Key Points

The study targets image super-resolution, where current diffusion approaches underuse pretrained backbones and leave the sampling trajectory unconstrained, which degrades structural consistency and fine detail.
Introduces the trajectory consistent diffusion model (TCDM), which optimizes the sampling process with lightweight components and inference-time strategies while keeping the diffusion backbone frozen.
Proposes a dynamic semantic selection (DSS) mechanism that records early intermediates, matches them to upsampled low-resolution features, and reconditions sampling with the best match.
Designs a cross-step aggregation guidance (CAG) strategy that aggregates features from the current state and the selected intermediate to keep noise prediction consistent along the trajectory.
Presents a plug-and-play frequency enhancement adapter (FE-Adapter) that injects frequency-domain cues into the encoder during training.
On multiple ISR benchmarks, TCDM achieves strong structural fidelity and competitive no-reference perceptual quality.
Overall, the findings indicate a favorable fidelity-perception trade-off.
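
As a rough illustration of what "frequency-domain cues" can mean, the sketch below splits an image into low- and high-frequency parts with a 2-D FFT. It is not the FE-Adapter itself: this summary does not say how the cues are computed, so the function name `split_frequency_bands`, the radial mask, and the `cutoff` parameter are all assumptions made for the example.

```python
import numpy as np


def split_frequency_bands(image, cutoff=0.25):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    `cutoff` is a hypothetical parameter: the radius of the low-pass mask
    in normalized frequency units (cycles per pixel).
    """
    h, w = image.shape
    # Frequency coordinate of every FFT bin along each axis.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # Bins inside the radius carry global structure; the rest carry detail.
    low_mask = radius <= cutoff

    spectrum = np.fft.fft2(image)
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high


rng = np.random.default_rng(0)
image = rng.random((16, 16))
low, high = split_frequency_bands(image)
# The two masks partition the spectrum, so the bands sum back to the input.
print(np.allclose(low + high, image))
```

The low band roughly corresponds to global structure and the high band to edges and texture, which is the distinction the FE-Adapter key point draws.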

Abstract

Diffusion models have shown strong promise for image super-resolution (ISR). However, current approaches often underuse pretrained diffusion backbones and lack constraints on the sampling trajectory, which degrades structural consistency and fine details. To address this, we introduce the trajectory consistent diffusion model (TCDM) for super-resolution, which jointly optimizes the sampling process through lightweight components and inference-time strategies while keeping the diffusion backbone frozen, yielding high-fidelity, detail-rich reconstructions. First, we propose a dynamic semantic selection (DSS) mechanism that records early intermediates, matches them to upsampled low-resolution features, and reconditions sampling with the best match to reduce the mismatch between conditioning and noise scale. Next, we design a cross-step aggregation guidance (CAG) strategy that aggregates features from the current state with the selected intermediate to enforce trajectory-level consistency in noise prediction. Finally, we present a plug-and-play frequency enhancement adapter (FE-Adapter) that injects different frequency-domain cues into the encoder during training, strengthening high-frequency perception while preserving global structures. Extensive experiments on multiple ISR benchmarks show that TCDM achieves strong structural fidelity and competitive no-reference perceptual quality, offering a favorable fidelity-perception trade-off.
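
The selection and aggregation steps described in the abstract can be sketched on toy arrays. This is only an illustration under stated assumptions: the abstract does not specify the matching score or the aggregation rule, so the cosine similarity, the linear blend, and the names `upsample_nearest`, `select_best_intermediate`, `aggregate`, and `weight` are hypothetical. The actual method operates on diffusion-model features, not raw arrays.

```python
import numpy as np


def upsample_nearest(features, scale):
    """Nearest-neighbour upsampling of an (H, W) array by an integer factor."""
    return np.repeat(np.repeat(features, scale, axis=0), scale, axis=1)


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two arrays, flattened to vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def select_best_intermediate(intermediates, lr_features, scale):
    """Pick the recorded intermediate closest to the upsampled LR features.

    Assumption: "best match" is scored here by cosine similarity.
    """
    target = upsample_nearest(lr_features, scale)
    scores = [cosine_similarity(x, target) for x in intermediates]
    return int(np.argmax(scores)), scores


def aggregate(current, selected, weight=0.5):
    """Blend the current state with the selected intermediate.

    Assumption: a plain linear blend stands in for cross-step aggregation.
    """
    return (1.0 - weight) * current + weight * selected


rng = np.random.default_rng(0)
lr_features = rng.random((4, 4))
target = upsample_nearest(lr_features, 2)

# Three recorded "intermediates": noise, a near match, and an inverted copy.
intermediates = [
    rng.standard_normal((8, 8)),
    target + 0.01 * rng.standard_normal((8, 8)),
    -target,
]
best, scores = select_best_intermediate(intermediates, lr_features, 2)
guided = aggregate(intermediates[0], intermediates[best], weight=0.5)
print(best, guided.shape)
```

In this toy setup the near match wins the selection, and the blended array plays the role of the features that would condition the next noise prediction.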
