What question did this study set out to answer?

This study aims to develop a multi-modal GAN that integrates MRI and CT images for improved analysis and anomaly detection while ensuring interpretability and sustainability.

June 29, 2026 · Open Access

SE-MMFusionGAN: A Sustainable and Explainable Multi-modal Fusion GAN for MRI–CT Image Fusion and Anomaly Localization

Key Points

The proposed SE-MMFusionGAN framework fuses MRI and CT modalities using a generative adversarial network.
It employs a cross-modality attention mechanism to preserve features and an activation-efficiency regularization module to reduce energy use.
It was evaluated on multiple benchmark datasets, including BraTS 2024, CT–MRI paired data, and BMAD, using PSNR, SSIM, Dice, and IoU.
SE-MMFusionGAN achieved improved PSNR, SSIM, Dice, and IoU scores compared to existing fusion methods (exact figures are not detailed).
Energy consumption during image processing was significantly reduced.
Gradient-based attribution provides interpretable heatmaps that make the fused images more explainable.
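
For readers unfamiliar with the reported metrics, the sketch below computes PSNR, Dice, and IoU from their standard textbook definitions. It is not the authors' evaluation code: the function names, the flat-list input format, and the 8-bit `max_value` default are assumptions made for illustration. SSIM is omitted because it needs windowed local statistics that do not fit in a short example.

```python
import math


def psnr(reference, fused, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized flat pixel lists.

    max_value=255.0 assumes 8-bit intensities (an illustrative assumption).
    """
    if len(reference) != len(fused) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - f) ** 2 for r, f in zip(reference, fused)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def dice_and_iou(pred_mask, true_mask):
    """Dice coefficient and IoU for two equally sized flat binary (0/1) masks."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must be the same length")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    pred_sum = sum(1 for p in pred_mask if p)
    true_sum = sum(1 for t in true_mask if t)
    union = pred_sum + true_sum - intersection
    if union == 0:
        return 1.0, 1.0  # both masks empty: treat as perfect agreement
    dice = 2.0 * intersection / (pred_sum + true_sum)
    iou = intersection / union
    return dice, iou
```

PSNR scores the fused image against a reference image, while Dice and IoU score a predicted anomaly mask against a ground-truth mask, which is why the two groups of metrics take different inputs.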

Abstract

Accurate and interpretable analysis of medical images requires the integration of complementary modalities such as Magnetic Resonance Imaging (MRI) and Computed Tomography (CT). This paper proposes SE-MMFusionGAN, a Sustainable and Explainable Multi-Modal Fusion Generative Adversarial Network that unifies interpretability, efficiency, and fusion quality within a single framework. Dual modality-specific encoders extract rich structural and textural representations, which are fused through a Cross-Modality Attention (CMA) mechanism to preserve clinically relevant features. An Activation-Efficiency Regularization module minimizes redundant activations, reducing computational and energy overhead, while a gradient-based attribution mechanism provides modality-aware explainability through interpretable heatmaps. The resulting fused representation enhances both visual interpretability for clinicians and downstream analysis, including anomaly localization. Extensive evaluations on benchmark datasets, including BraTS 2024, CT–MRI paired data, and BMAD, demonstrate that SE-MMFusionGAN achieves improved PSNR, SSIM, Dice, and IoU scores compared to state-of-the-art fusion networks, along with a significant reduction in energy consumption. The framework thus establishes a balance between fidelity, interpretability, and sustainability, offering a practical step toward energy-aware and explainable multi-modal medical image fusion.
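
The abstract does not specify how the Cross-Modality Attention mechanism is built. As a rough illustration of the general idea only, the sketch below applies generic scaled dot-product cross-attention, with MRI feature vectors as queries and CT feature vectors as keys and values, followed by a residual sum. The function names, the absence of learned projections, the residual connection, and the choice of MRI as the query side are all assumptions, not the paper's design.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modality_attention(mri_tokens, ct_tokens):
    """Let each MRI feature vector attend over all CT feature vectors.

    Hypothetical simplification: no learned query/key/value projections.
    Both inputs are lists of equal-length float lists.
    """
    dim = len(mri_tokens[0])
    scale = math.sqrt(dim)
    fused = []
    for query in mri_tokens:
        # Similarity of this MRI vector to every CT vector.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in ct_tokens]
        weights = softmax(scores)
        # Attention-weighted average of the CT vectors.
        attended = [sum(w * value[d] for w, value in zip(weights, ct_tokens))
                    for d in range(dim)]
        # Residual sum keeps the original MRI features (an assumption).
        fused.append([q + a for q, a in zip(query, attended)])
    return fused
```

In a trained network the queries, keys, and values would normally pass through learned linear layers first; leaving them out keeps the example dependency-free while still showing how one modality can select complementary features from the other.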
