What question did this study set out to answer?

To improve the speed and quality of clinical medical image fusion using an edge-aware diffusion model.

May 6, 2026 | Open Access

Accelerated Edge-Aware Diffusion Model with Spatial Refinement for Clinical Medical Image Fusion

Key Points

Developed an accelerated edge-aware diffusion model with spatial refinement.
Utilized edge-enhanced data blocks and non-uniform time-step sampling.
Implemented a Nesterov-accelerated alternating direction method of multipliers (ADMM) for pixel-level corrections.
Reduced inference time by approximately 42% compared to the baseline.
Demonstrated superior performance in image fidelity and structural preservation.
Effectively merged soft tissue textures and skeletal contours in medical images.
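
The non-uniform time-step sampling mentioned in the key points can be illustrated with a minimal sketch. The summary does not state which spacing rule the authors use, so the power-law spacing below, the function name `nonuniform_timesteps`, and all default values are assumptions chosen for illustration only. The idea shown is simply that a short sampling schedule can take large steps where the noise level is high and small steps near the end, where fine structure is resolved.

```python
import numpy as np


def nonuniform_timesteps(num_train_steps=1000, num_sample_steps=20, power=2.0):
    """Return a decreasing array of integer diffusion timesteps.

    Hypothetical helper: power-law spacing is an assumed rule, not the
    schedule from the paper. With power > 1 the selected timesteps are
    packed more densely near t = 0 (low noise) than near the last step.
    """
    # Evenly spaced points in [0, 1], warped by the power law.
    u = np.linspace(0.0, 1.0, num_sample_steps)
    t = (u ** power) * (num_train_steps - 1)
    # Round to valid integer indices and drop any duplicates (sorted ascending).
    steps = np.unique(np.round(t).astype(int))
    # Samplers walk from high noise to low noise, so reverse the order.
    return steps[::-1]


if __name__ == "__main__":
    schedule = nonuniform_timesteps()
    print(schedule)
```

Setting `power=1.0` recovers a uniform schedule, which makes it easy to compare the two spacings with the same sampler.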

Abstract

Multimodal medical image fusion provides vital anatomical and pathological details for clinical diagnosis. However, existing diffusion-based fusion algorithms often suffer from long inference times and loss of local structure. To address these issues, we propose an accelerated edge-aware diffusion model with spatial refinement, built on a coarse-to-fine collaborative architecture. The model first extracts structural priors via edge-enhanced data blocks and a non-uniform time-step accelerated sampling strategy. During refinement, a spatially adaptive non-convex variational module applies a Nesterov-accelerated alternating direction method of multipliers (ADMM) for pixel-level correction, removing diffusion artifacts and sharpening anatomical boundaries. We conduct extensive comparative experiments against the vanilla diffusion baseline and state-of-the-art deep learning methods. Qualitative and quantitative evaluations on clinical datasets show that our model offers the best overall balance among the compared methods. It produces natural-looking fused images that merge sharp skeletal contours from computed tomography with rich soft tissue textures from magnetic resonance imaging while avoiding unnatural over-sharpening. It also performs strongly across a comprehensive set of statistical metrics, reflecting high image fidelity, robust global contrast, and precise structural preservation. Finally, the model reduces inference time by approximately 42% compared to the baseline. The framework thus balances fusion quality against computational efficiency, with potential utility for clinical image processing under limited resources.
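
To make the refinement step more concrete, here is a minimal sketch of a Nesterov-accelerated ADMM loop. It is not the authors' module: the abstract describes a spatially adaptive non-convex variational model on images, whereas this example substitutes a plain convex 1D total-variation denoising problem, min over x of 0.5 * ||x - f||^2 + lam * ||Dx||_1, purely to show how momentum is added to the ADMM iterates. The restart rule follows a commonly used fast-ADMM variant, and the function names and parameter values are hypothetical.

```python
import numpy as np


def soft_threshold(v, thresh):
    """Elementwise shrinkage, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)


def tv_denoise_fast_admm(f, lam=0.5, rho=1.0, num_iters=200, eta=0.999):
    """1D TV denoising via ADMM with Nesterov momentum and restart.

    Illustrative stand-in only (convex, 1D); the splitting is z = D x,
    with u the scaled dual variable.
    """
    f = np.asarray(f, dtype=float)
    n = f.size
    D = np.diff(np.eye(n), axis=0)        # forward differences, shape (n-1, n)
    A = np.eye(n) + rho * D.T @ D         # system matrix of the x-update
    z_prev = np.zeros(n - 1)
    u_prev = np.zeros(n - 1)
    z_hat, u_hat = z_prev.copy(), u_prev.copy()
    a_prev, c_prev = 1.0, np.inf
    x = f.copy()
    for _ in range(num_iters):
        # Standard ADMM updates, evaluated at the extrapolated points.
        x = np.linalg.solve(A, f + rho * D.T @ (z_hat - u_hat))
        Dx = D @ x
        z = soft_threshold(Dx + u_hat, lam / rho)
        u = u_hat + Dx - z
        # Combined residual used to decide whether momentum is helping.
        c = rho * (np.sum((u - u_hat) ** 2) + np.sum((z - z_hat) ** 2))
        if c < eta * c_prev:
            # Nesterov extrapolation of the auxiliary and dual variables.
            a = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * a_prev ** 2))
            beta = (a_prev - 1.0) / a
            z_hat = z + beta * (z - z_prev)
            u_hat = u + beta * (u - u_prev)
        else:
            # Restart: drop the momentum and fall back to the last iterates.
            a = 1.0
            z_hat, u_hat = z_prev.copy(), u_prev.copy()
            c = c_prev / eta
        z_prev, u_prev, a_prev, c_prev = z, u, a, c
    return x


if __name__ == "__main__":
    clean = np.concatenate([np.zeros(20), np.ones(20)])
    noisy = clean + 0.1 * np.sin(1.7 * np.arange(40))
    print(np.round(tv_denoise_fast_admm(noisy), 2))
```

A spatially adaptive variant could pass `lam` as an array with one weight per difference, lowering the penalty where an edge map indicates a boundary; that extension is likewise an assumption rather than a detail taken from the paper.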
