Visible-infrared image fusion is crucial for applications such as autonomous driving and nighttime surveillance, yet it remains challenging because of inherent limitations in existing deep learning models: Convolutional Neural Networks (CNNs) are constrained by their local receptive fields, while Transformers suffer from quadratic computational complexity. To address these issues, this paper investigates applying Mamba, a State Space Model (SSM) with linear-complexity global modeling and selective scanning, to visible-infrared image fusion. Building on Mamba, we propose a fusion framework with two key designs: (1) a Multi-Path Mamba (MPMamba) module that combines parallel Mamba blocks with convolutional streams to extract multi-scale, modality-specific features; and (2) a Dual-path Mamba Attention Fusion (DMAF) module that explicitly decouples shared and complementary features and processes them via dual Mamba paths, followed by dynamic calibration with a Convolutional Block Attention Module (CBAM). Extensive experiments on the MSRS benchmark demonstrate that our framework achieves state-of-the-art performance, outperforming strong baselines such as U2Fusion and SwinFusion across key metrics including Information Entropy (EN), Spatial Frequency (SF), Mutual Information (MI), and edge-based fusion quality (Qabf). Visual results confirm that the fused images keep thermal targets salient while retaining rich texture details.
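Two of the reported metrics, EN and SF, have simple closed-form definitions. The following is a minimal sketch of how they are commonly computed for an 8-bit grayscale fused image. It is not taken from the paper: the function names `information_entropy` and `spatial_frequency` are hypothetical, and the exact normalization used in the authors' evaluation may differ.

```python
import numpy as np


def information_entropy(img: np.ndarray) -> float:
    """Shannon entropy (in bits) of the intensity histogram of a uint8 image."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]  # skip empty bins; 0 * log(0) is treated as 0
    return float(-(p * np.log2(p)).sum())


def spatial_frequency(img: np.ndarray) -> float:
    """Spatial frequency sqrt(RF^2 + CF^2) from first-order pixel differences.

    Assumption: the squared differences are averaged over the number of
    difference terms; some implementations divide by the pixel count instead.
    """
    x = img.astype(np.float64)
    rf2 = np.mean((x[:, 1:] - x[:, :-1]) ** 2)  # horizontal (row) frequency, squared
    cf2 = np.mean((x[1:, :] - x[:-1, :]) ** 2)  # vertical (column) frequency, squared
    return float(np.sqrt(rf2 + cf2))
```

Under these definitions a constant image scores zero on both metrics, while higher EN and SF indicate a richer intensity distribution and stronger local gradients, respectively.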
Jinsong He
Jianghua Cheng
Tong Liu
Remote Sensing
National University of Defense Technology
He et al. studied this question.
www.synapsesocial.com/papers/6997fa12ad1d9b11b345311d — DOI: https://doi.org/10.3390/rs18040636