What question did this study set out to answer?

The central aim is to improve pedestrian detection in challenging lighting conditions using infrared and visible images.

December 8, 2025 | Open Access

Infrared–Visible Fusion via Cross-Modality Attention and Small-Object Enhancement for Pedestrian Detection

Key Points

Developed IVIFusion, a framework that fuses infrared and visible images at the feature level
Used a dual-branch Transformer-based backbone for modality-specific feature extraction
Introduced a Cross-Modality Attention Fusion Module (CMAFM) to enhance cross-modal representations and suppress noise
Incorporated a small-object detection layer to improve recall for distant and occluded pedestrians
Achieved mAP0.5 scores of 98.6% on the LLVIP dataset and 97.2% on the HGPD dataset
Reported superior performance under low illumination and in complex environments
Maintained real-time efficiency and low computational cost
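
For readers unfamiliar with the metric, mAP0.5 counts a predicted box as correct only when its intersection over union (IoU) with an unmatched ground-truth box is at least 0.5. The sketch below illustrates that rule for a single class, where mAP reduces to AP. It is a simplified illustration, not the evaluation code used in the study: the function names, the (x1, y1, x2, y2) box format, the single-image matching, and the all-point interpolation are all assumptions made here.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """AP for one class on one image (a simplifying assumption).

    detections: list of (score, box); ground_truth: list of boxes.
    """
    if not ground_truth:
        return 0.0
    matched = set()
    hits = []
    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(True)
        else:
            hits.append(False)

    # Build the precision and recall values at each rank.
    true_positives = 0
    precisions, recalls = [], []
    for rank, hit in enumerate(hits, start=1):
        true_positives += 1 if hit else 0
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))

    # All-point interpolation: area under the monotone precision envelope.
    ap, previous_recall = 0.0, 0.0
    for i in range(len(precisions)):
        ap += (recalls[i] - previous_recall) * max(precisions[i:])
        previous_recall = recalls[i]
    return ap
```

A full benchmark evaluation would match detections per image across the whole dataset before ranking them; the single-image version is kept here only to make the 0.5 threshold easy to follow.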

Abstract

Pedestrian detection under low illumination and complex environments remains a significant challenge for vision-based systems, particularly in safety-critical applications such as urban rail transit. To address the limitations of single-modality detection in adverse conditions, this paper proposes IVIFusion, a lightweight yet robust pedestrian detection framework that fuses infrared and visible images at the feature level. The method integrates a dual-branch Transformer-based backbone for modality-specific feature extraction and introduces a Cross-Modality Attention Fusion Module (CMAFM) to adaptively enhance cross-modal representations while suppressing noise. Furthermore, a dedicated small-object detection layer is incorporated to improve the recall of distant and occluded pedestrians. Extensive experiments conducted on the public LLVIP dataset and the custom HGPD dataset demonstrate the superior performance of IVIFusion, achieving mAP0.5 scores of 98.6% and 97.2%, respectively. The results validate the effectiveness of the proposed architecture in handling challenging lighting conditions while maintaining real-time efficiency and low computational cost.
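
The abstract does not specify the internals of the CMAFM, so the following is only a toy sketch of the general idea of feature-level fusion with attention-style weighting, not the authors' module. It assumes each modality's feature map is a list of channels, each channel a flat list of activations, and it uses a per-channel softmax over globally averaged responses as a stand-in for learned attention. The names `fuse_features` and `global_average` are hypothetical.

```python
import math


def global_average(channel):
    """Mean activation of one channel (a crude global descriptor)."""
    return sum(channel) / len(channel)


def fuse_features(ir_feat, vis_feat):
    """Fuse infrared and visible features channel by channel.

    For each channel, a softmax over the two modality descriptors gives
    weights that favor the modality with the stronger average response.
    This is an assumed stand-in for learned attention, not the CMAFM.
    """
    if len(ir_feat) != len(vis_feat):
        raise ValueError("both modalities must have the same channel count")
    fused = []
    for ir_ch, vis_ch in zip(ir_feat, vis_feat):
        if len(ir_ch) != len(vis_ch):
            raise ValueError("channels must have the same spatial size")
        ir_score = global_average(ir_ch)
        vis_score = global_average(vis_ch)
        # Subtract the larger score for a numerically stable softmax.
        peak = max(ir_score, vis_score)
        e_ir = math.exp(ir_score - peak)
        e_vis = math.exp(vis_score - peak)
        w_ir = e_ir / (e_ir + e_vis)
        w_vis = 1.0 - w_ir
        fused.append([w_ir * a + w_vis * b for a, b in zip(ir_ch, vis_ch)])
    return fused
```

In a dark scene where the visible branch responds weakly, the weights in this sketch shift toward the infrared channel, which is the intuition behind adaptive cross-modal weighting; a trained network would typically learn such weights rather than derive them from raw averages.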

Cite This Study

Yang et al. studied this question.

synapsesocial.com/papers/693624c34fa91c937236ccb8 https://doi.org/10.3390/ijgi14120477
