Key points are not available for this paper at this time.
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality retrieval task due to the large modality gap. While numerous efforts have been devoted to the supervised setting with a large amount of labeled cross-modality correspondences, few studies have tried to mitigate the modality gap by mining cross-modality correspondences in an unsupervised manner. However, existing works failed to capture the intrinsic relations among samples across two modalities, resulting in limited performance outcomes. In this paper, we propose a novel Progressive Graph Matching (PGM) approach to globally model the cross-modality relationships and instance-level affinities. PGM formulates cross-modality correspondence mining as a graph matching procedure, aiming to integrate global information by minimizing global matching costs. Considering that samples in wrong clusters cannot find reliable cross-modality correspondences by PGM, we further introduce a robust Dual-Level Matching (DLM) mechanism, combining the cluster-level PGM and Nearest Instance-Cluster Searching (NICS) with instance-level affinity optimization. Additionally, we design an Outlier Filter Strategy (OFS) to filter out unreliable cross-modality correspondences based on the dual-level relation constraints. To mitigate false accumulation in cross-modal correspondence learning, an Alternate Cross Contrastive Learning (ACCL) module is proposed to alternately adjust the dominated matching, i.e., visible-to-infrared or infrared-to-visible matching. Empirical results demonstrate the superiority of our unsupervised solution, achieving comparable performance with supervised counterparts.
Ye et al. (Tue,) studied this question.