What question did this study set out to answer?

The study aims to improve the accuracy of multimodal medical image registration by addressing modality differences using an unsupervised learning approach.

June 1, 2026

Unsupervised multimodal deformable medical image registration based on feature perceptual contrast learning

Key Points

Utilized a Feature Perceptual Contrast Learning Network (FP-net) to learn descriptors that bridge modality differences.
Applied an unsupervised registration framework that does not require ground-truth deformation fields.
Evaluated the method on BraTS 2021 and Learn2Reg 2021 datasets for various image registration tasks.
Achieved a Dice Similarity Coefficient (DSC) of 76.3% in T2-T1 and 77.7% in T1-T1ce registration tasks.
In abdominal CT-MR registration, obtained a DSC of 50.1%, showing significant structural alignment improvement.
Demonstrated superior performance against state-of-the-art registration methods.
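
The Dice Similarity Coefficient reported above measures the overlap between an anatomical label in the fixed image and the same label in the warped moving image: DSC = 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard formula, not the authors' evaluation code. The function name `dice_coefficient`, the flattened label-map representation, and the toy label maps are assumptions made for illustration only.

```python
def dice_coefficient(fixed_labels, warped_labels, label):
    """Dice overlap for one anatomical label between two flattened label maps.

    Both inputs are sequences of integer labels, one entry per voxel
    (an illustrative representation, not the paper's data format).
    """
    if len(fixed_labels) != len(warped_labels):
        raise ValueError("label maps must have the same number of voxels")

    # Binary masks selecting the voxels that carry the requested label.
    in_fixed = [v == label for v in fixed_labels]
    in_warped = [v == label for v in warped_labels]

    intersection = sum(a and b for a, b in zip(in_fixed, in_warped))
    total = sum(in_fixed) + sum(in_warped)
    if total == 0:
        # Assumed convention: a label absent from both maps counts as full agreement.
        return 1.0
    return 2.0 * intersection / total


# Toy 8-voxel label maps (hypothetical data).
fixed = [0, 1, 1, 1, 0, 2, 2, 0]
warped = [0, 1, 1, 0, 0, 2, 2, 2]

dsc_label_1 = dice_coefficient(fixed, warped, 1)
dsc_label_2 = dice_coefficient(fixed, warped, 2)
mean_dsc = (dsc_label_1 + dsc_label_2) / 2
```

A DSC of 1.0 means the two label regions coincide exactly and 0.0 means they do not overlap at all. Averaging the per-label scores, as `mean_dsc` does here, is one common way to arrive at a single percentage.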

Abstract

Objective. Multimodal medical image registration has extensive applications in clinical diagnosis and is fundamental for a series of medical analysis tasks. However, the presence of modality differences makes the registration process challenging. Existing methods often employ modality-independent feature descriptors that are sensitive to noise, or attempt to bridge differences within networks, which typically results in translation inaccuracies and misaligned anatomical information.

Approach. In this paper, we propose a novel unsupervised approach utilizing a Feature Perceptual Contrast Learning Network (FP-net) to learn descriptors that bridge modality differences while accurately capturing common details. We unify the feature representation of anatomical information under homogeneous and heterogeneous intensity distributions through local sampling-based feature perceptual contrast learning and image reconstruction learning. The trained FP-net is subsequently employed to drive an unsupervised registration framework without requiring ground-truth deformation fields.

Main results. We extensively evaluated our method on two public benchmarks: the BraTS 2021 dataset for brain T2-T1 and T1-T1ce registration, and the Learn2Reg 2021 dataset for challenging abdominal CT-MR registration. By passing multimodal image pairs with shape differences through the fixed FP-net, we generate optimization gradients that successfully update the registration network. Quantitative evaluations demonstrate our method's superiority over state-of-the-art baselines. Specifically, our model achieved a Dice Similarity Coefficient (DSC) of 76.3% and 77.7% in tumor-bearing T2-T1 and T1-T1ce tasks, respectively. Furthermore, in the complex abdominal CT-MR task, it reached a DSC of 50.1%, significantly improving structural alignment.

Significance. Our method effectively shifts the burden of bridging the modality gap away from the registration network, enabling standard U-Net architectures to achieve state-of-the-art deformable registration. This provides a robust, accurate, and easily deployable unsupervised solution for complex clinical multimodal image analysis.
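
The abstract does not give the form of the contrast learning objective. As a rough illustration of the general idea behind locally sampled contrastive learning, the sketch below uses a generic InfoNCE-style loss, which is a stand-in and not the FP-net loss. It assumes feature vectors sampled at matching locations in two modalities form positive pairs, and that features from other sampled locations act as negatives. The function names, the temperature value, and the toy feature vectors are all hypothetical.

```python
import math


def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity between two feature vectors (eps avoids division by zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def info_nce_loss(features_a, features_b, temperature=0.1):
    """Generic InfoNCE loss over locally sampled feature vectors.

    features_a[i] and features_b[i] are assumed to be sampled at the same
    anatomical location in two modalities (the positive pair); every other
    entry of features_b serves as a negative for features_a[i].
    """
    losses = []
    for i, anchor in enumerate(features_a):
        logits = [cosine_similarity(anchor, f) / temperature for f in features_b]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the matching location as the correct class.
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy 2-D features at two sampled locations (hypothetical data).
modality_a = [[1.0, 0.0], [0.0, 1.0]]
modality_b_aligned = [[0.9, 0.1], [0.1, 0.9]]
modality_b_swapped = [[0.1, 0.9], [0.9, 0.1]]

aligned_loss = info_nce_loss(modality_a, modality_b_aligned)
swapped_loss = info_nce_loss(modality_a, modality_b_swapped)
```

In this toy setup the loss is small when features from the same location agree across modalities and large when they are swapped. A modality-bridging descriptor needs that property so that a similarity computed on its outputs reflects anatomical alignment.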
