What does this research mean for the field?

The proposed Graph Aggregation Alignment Network (GAANet) achieves state-of-the-art multispectral object detection accuracy on the DroneVehicle, LLVIP, and FLIR datasets while reducing model size by 61.2% compared with representative baselines such as CALNet. Novelty: methodological. Consensus alignment: neutral.

July 25, 2025

GAANet: Graph Aggregation Alignment Feature Fusion for Multispectral Object Detection

Key points

Key points are not available for this article at this time.

Abstract

Multispectral object detection has shown great promise in security and industrial applications. RGB images offer rich texture but are limited by lighting, whereas IR images excel in low light but lack texture. Current methods face challenges in accurately capturing information differences and achieving effective feature fusion across modalities. To address these issues, we propose a graph aggregation alignment network (GAANet) for multispectral object detection. GAANet consists of two key modules: the graph interaction fusion module (GIFM) and the information alignment module (IAM). GIFM uses graph representation learning to effectively process single-modality features, and the direct connection information flow mechanism guides and references low-level multimodal features, ensuring the global and comprehensive fusion of node information in the graph space. The results are then refined through the IAM for secondary calibration and alignment of corresponding local regions, ensuring accurate fusion. We also introduce an information reconstruction path (IRP) and reconstruction loss to prevent the loss of single-modality information due to multiple IAM calculations. GAANet achieves excellent fusion detection capability and significantly reduces the number of parameters, reducing the model size by 61.2% compared with that of representative baselines such as CALNet. GAANet achieves state-of-the-art results on the DroneVehicle, LLVIP, and FLIR datasets, with superior object detection accuracy. It also performs well on the unaligned DVTOD dataset, effectively capturing feature offsets across modalities through global graph perception.
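
As a rough illustration of fusing two modalities in graph space and penalizing lost single-modality information, here is a minimal NumPy sketch. It is not GAANet's implementation: the abstract does not specify how GIFM, IAM, or the IRP are built, so the function names `graph_fuse` and `reconstruction_loss`, the dense dot-product adjacency, the averaging step, and the identity projection matrices are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # stabilise the exponentials
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def graph_fuse(rgb, ir):
    """Fuse two (N, C) node-feature matrices over a dense similarity graph.

    Hypothetical stand-in for a graph fusion step: the N spatial positions
    of each modality become 2N nodes, edges are scaled dot-product
    similarities, and every node aggregates features from all nodes.
    """
    nodes = np.concatenate([rgb, ir], axis=0)            # (2N, C)
    scores = nodes @ nodes.T / np.sqrt(nodes.shape[1])   # (2N, 2N) edge scores
    adj = softmax(scores, axis=1)                        # row-normalised adjacency
    agg = adj @ nodes                                    # one round of aggregation
    n = rgb.shape[0]
    return 0.5 * (agg[:n] + agg[n:])                     # (N, C) fused features

def reconstruction_loss(fused, rgb, ir, w_rgb, w_ir):
    """Sum of MSEs between linear reconstructions and the original modalities.

    w_rgb and w_ir are assumed (C, C) projection matrices; in a trained
    model they would be learned, here they are passed in by the caller.
    """
    return np.mean((fused @ w_rgb - rgb) ** 2) + np.mean((fused @ w_ir - ir) ** 2)

rng = np.random.default_rng(0)
rgb = rng.normal(size=(16, 8))   # toy sizes: 16 positions, 8 channels
ir = rng.normal(size=(16, 8))
fused = graph_fuse(rgb, ir)
loss = reconstruction_loss(fused, rgb, ir, np.eye(8), np.eye(8))
print(fused.shape, float(loss))
```

Because every node attends to every other node, a feature at one position in the IR map can draw on an RGB feature at a different position, which is one plausible reading of how global graph perception could tolerate the spatial offsets found in unaligned data such as DVTOD.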
