• We propose a hierarchical matching strategy. Matching detections to trajectories only within a limited spatial range reduces the probability of erroneous matches between targets that are far apart in space. A regional spatial matrix is proposed to resolve the assignment of detection boxes when multiple targets overlap.
• We propose the VD-IOU method. It uses the spatial position of each detection box and its candidate trajectories to compute a spatial intersection-over-union ratio between the two, which serves as a strong basis for target association.
• By integrating the spatial hierarchical matching strategy with VD-IOU, the proposed SH-Tracker improves tracking accuracy and reaches state-of-the-art (SOTA) performance.

Multi-object tracking (MOT) is pivotal for applications such as autonomous driving and surveillance, yet persistent challenges such as identity switches under occlusion and target overlap remain unsolved. Existing methods rely primarily on 2D positional data, which limits their robustness in crowded scenes. To address this, we propose SH-Tracker, a novel framework that integrates a Hierarchical Matching Strategy (HMS) and Virtual Depth Intersection-over-Union (VD-IOU). HMS reduces mismatches by constraining target associations to dynamic spatial ranges, while VD-IOU enriches 2D detections with virtual depth to model 3D spatial relationships. The combination is novel in that it achieves pseudo-3D modeling without requiring depth sensors, providing a new perspective for camera-only MOT. Evaluated on the MOT16, MOT17, MOT20, and DanceTrack benchmarks, SH-Tracker achieves state-of-the-art performance, e.g., ranking first on MOT16 (64.4% HOTA, 79.3% MOTA) and MOT17 (64.0% HOTA), and significantly boosting performance on DanceTrack (58.8% HOTA, +3.7% over the baseline). The method yields marked improvements in association accuracy (AssA), particularly in scenarios with nonlinear motion and heavy occlusion.
We believe SH-Tracker provides a robust and practical solution for complex MOT scenarios.
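To make the two ideas concrete, the following is a minimal illustrative sketch, not the SH-Tracker implementation. It rests on several assumptions that are not taken from the text above: the virtual depth of a box is approximated from its bottom edge (lower in the image means closer to the camera), the depth extent is set equal to the box width, the spatial range is a fixed center-distance gate, and a simple greedy assignment stands in for whatever assignment solver the tracker actually uses. All function names and parameters (`virtual_depth_interval`, `pseudo_3d_iou`, `gated_greedy_match`, `max_center_dist`, `min_iou`) are hypothetical.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def interval_overlap(a_lo: float, a_hi: float, b_lo: float, b_hi: float) -> float:
    """Length of the overlap between two 1D intervals (0 if disjoint)."""
    return max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))


def virtual_depth_interval(box: Box, image_height: float) -> Tuple[float, float]:
    """ASSUMPTION: depth proxy from the bottom edge, extent equal to box width."""
    x1, _, x2, y2 = box
    near = image_height - y2  # lower in the image -> smaller virtual depth
    return near, near + (x2 - x1)


def pseudo_3d_iou(a: Box, b: Box, image_height: float) -> float:
    """IoU of two axis-aligned cuboids built from 2D boxes plus virtual depth."""
    az = virtual_depth_interval(a, image_height)
    bz = virtual_depth_interval(b, image_height)
    inter = (
        interval_overlap(a[0], a[2], b[0], b[2])
        * interval_overlap(a[1], a[3], b[1], b[3])
        * interval_overlap(az[0], az[1], bz[0], bz[1])
    )
    vol_a = (a[2] - a[0]) * (a[3] - a[1]) * (az[1] - az[0])
    vol_b = (b[2] - b[0]) * (b[3] - b[1]) * (bz[1] - bz[0])
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0


def center_distance(a: Box, b: Box) -> float:
    """Euclidean distance between box centers."""
    return math.hypot(
        (a[0] + a[2]) / 2 - (b[0] + b[2]) / 2,
        (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2,
    )


def gated_greedy_match(
    tracks: List[Box],
    detections: List[Box],
    image_height: float,
    max_center_dist: float,
    min_iou: float = 0.1,
) -> List[Tuple[int, int]]:
    """Match tracks to detections, considering only pairs inside the gate.

    Greedy assignment by descending score is a simplification chosen for
    brevity; it is not claimed to be the solver used by SH-Tracker.
    """
    candidates = []
    for ti, t in enumerate(tracks):
        for di, d in enumerate(detections):
            if center_distance(t, d) > max_center_dist:
                continue  # spatially distant pairs are never compared
            score = pseudo_3d_iou(t, d, image_height)
            if score >= min_iou:
                candidates.append((score, ti, di))

    candidates.sort(reverse=True)
    used_tracks, used_dets, matches = set(), set(), []
    for _, ti, di in candidates:
        if ti in used_tracks or di in used_dets:
            continue
        used_tracks.add(ti)
        used_dets.add(di)
        matches.append((ti, di))
    return matches
```

Under these assumptions, two boxes that overlap heavily in the image plane but have different bottom edges receive a low pseudo-3D score, which is the kind of separation that a depth-aware IoU is meant to provide for overlapping targets. The gate plays the role of the spatial range restriction: pairs outside it are excluded before any score is computed.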
Jia et al. (Sun,) studied this question.