Because the RGB and thermal (RGBT) modalities are complementary, RGBT trackers have become a solution for visual tracking in variable and complex scenarios. Most existing works focus solely on exploiting collaborative representations fused from the two modalities. Although these methods fuse information across modalities effectively, they neglect the potential value of the specific representations of each modality. They also suffer from poor tracking efficiency, which limits their practical utility. In this paper, a novel Siamese tracker synergizing specific and collaborative representations (SiamSCR) is proposed for real-time RGBT tracking. Specifically, a new cross-layer feature aggregation module is built to facilitate interaction between deep features from different layers, which yields specific representations for the two modalities. Next, to better leverage the complementary information between the RGB and thermal modalities, a Global Cross-Attention Fusion module is designed to obtain collaborative representations. Finally, the specific and collaborative representations are fed simultaneously into the classification-regression network, where three different types of information collaborate to complete bounding box prediction. Extensive experiments on three large-scale RGBT benchmarks demonstrate outstanding tracking capabilities over other state-of-the-art trackers, with tracking speeds exceeding 37.2 FPS.
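The abstract does not give the internals of the Global Cross-Attention Fusion module, so the following is only a minimal sketch of the general idea of cross-attention between two modalities, not the authors' implementation. It assumes each modality is already flattened into the same number of feature tokens of equal dimension, uses no learned projections, and combines the results by a simple residual sum. The function names `softmax`, `cross_attention`, and `fuse` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Each query token attends over all tokens of the other modality.

    Assumption: keys and values are the same tokens (no learned
    projection matrices), which a real module would normally have.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        # Scaled dot-product score between this query and every key.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            for k in keys_values
        ]
        weights = softmax(scores)
        # Weighted sum of the value tokens.
        out.append(
            [sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(d)]
        )
    return out


def fuse(rgb_tokens, thermal_tokens):
    """Hypothetical fusion: attend in both directions, then sum.

    Assumes both modalities have the same number of tokens so that
    they can be added position by position.
    """
    rgb_from_thermal = cross_attention(rgb_tokens, thermal_tokens)
    thermal_from_rgb = cross_attention(thermal_tokens, rgb_tokens)
    return [
        [r + a + t + b for r, a, t, b in zip(rv, av, tv, bv)]
        for rv, av, tv, bv in zip(
            rgb_tokens, rgb_from_thermal, thermal_tokens, thermal_from_rgb
        )
    ]


if __name__ == "__main__":
    rgb = [[1.0, 0.0], [0.0, 1.0]]
    thermal = [[0.5, 0.5], [0.5, 0.5]]
    print(fuse(rgb, thermal))
```

In this sketch the fused tokens play the role of a collaborative representation, while the unmodified `rgb` and `thermal` tokens stand in for the modality-specific ones. How the paper actually combines them in the classification-regression network is not specified in the abstract.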
Yisong Liu
Dongming Zhou
Jinde Cao
IEEE Sensors Journal
Southeast University
Yunnan University
Ahlia University
Liu et al. studied this question.
www.synapsesocial.com/papers/68e6f047b6db64358766adf8 — DOI: https://doi.org/10.1109/jsen.2024.3386772