As urbanization accelerates and public demand for security grows, video surveillance has become a vital tool for maintaining social stability and safeguarding public safety. Person Re-Identification (Re-ID), a core technology in intelligent surveillance, aims to match pedestrian identities across cameras with non-overlapping fields of view. In practice, however, occlusion remains a primary challenge that severely degrades Re-ID performance. In high-density crowds especially, pedestrians are often partially or completely obscured by objects or other people, which leaves image information incomplete, impairs feature representation, and reduces recognition accuracy and reliability. To address two problems in occluded Re-ID, namely excessive reliance on external pose estimation models and asymmetric information matching, this paper proposes a transformer-based pedestrian-background decoupling network. The network achieves foreground–background separation and multi-scale feature matching through three cooperating modules. It is trained in two stages: the first stage optimizes the decoupling module to ensure clean feature separation, and the second stage jointly fine-tunes the correlation module to improve matching accuracy. Extensive experiments show that the proposed method outperforms existing methods.
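The two-stage training strategy can be illustrated with a minimal, framework-free sketch of which modules receive updates at each epoch. Everything here is an assumption made for illustration: the module names `decoupling` and `correlation`, the epoch-based switch point `stage1_epochs`, and the reading of "jointly" as updating both modules in the second stage. None of these details are specified in the abstract, which also does not name the third module.

```python
# Illustrative sketch only: module names and the epoch-based switch are
# assumptions, not details taken from the paper.
STAGE_TRAINABLE = {
    1: {"decoupling"},                 # stage 1: optimize the decoupling module alone
    2: {"decoupling", "correlation"},  # stage 2: joint fine-tuning (assumed reading)
}


def trainable_modules(epoch, stage1_epochs):
    """Return the set of module names updated at a 0-based epoch."""
    if epoch < 0 or stage1_epochs < 0:
        raise ValueError("epoch and stage1_epochs must be non-negative")
    stage = 1 if epoch < stage1_epochs else 2
    return STAGE_TRAINABLE[stage]


def build_schedule(total_epochs, stage1_epochs):
    """List the trainable modules (sorted by name) for every epoch."""
    return [sorted(trainable_modules(e, stage1_epochs))
            for e in range(total_epochs)]


if __name__ == "__main__":
    # Hypothetical run: 2 epochs of stage 1 followed by 2 epochs of stage 2.
    for epoch, names in enumerate(build_schedule(4, 2)):
        print(epoch, names)
```

In a deep learning framework, the same schedule would typically be realized by freezing the parameters of modules outside the returned set, or by leaving them out of the optimizer, before each stage begins.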
Li et al. (Sun,) studied this question.