Rear-view human tracking and re-identification remain critical challenges for robotic visual sensing in unmanned vehicles, particularly under adverse weather and severe occlusion. Conventional deep learning models often suffer from feature contamination and trajectory drift under dynamic illumination. To address these problems, we propose a lightweight tracking framework driven by spatiotemporal prediction and multimodal feature fusion. Specifically, an ego-motion-aware Kalman prediction mechanism maintains temporal continuity during complete occlusions. When the target reappears, a multi-factor descriptor that fuses color histograms with geometric constraints is matched within a dynamic Mahalanobis search region. This is coupled with a specular-reflection-penalized adaptive learning rate (η_k) that freezes template updates during severe environmental degradation. Evaluated on a custom Mecanum-wheeled robot, the proposed method achieves a peak precision of 94.2% and a tracking success rate of 93.4%. Experiments in extreme rainy night scenarios show a 35% reduction in average tracking error, with the Center Location Error (CLE) kept below 11 pixels. Furthermore, the system achieves a target re-identification response time of 72.83 ms during occlusion phases. The framework thus offers a robust, real-time solution for autonomous navigation in complex dynamic environments.
Jia et al. (Tue,) studied this question.
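The three mechanisms named in the abstract (ego-motion-aware Kalman prediction, a Mahalanobis search region, and a learning rate that freezes template updates) can be illustrated with a minimal sketch. The abstract gives no equations, so everything specific here is an assumption made for illustration: the constant-velocity state layout [px, py, vx, vy], the additive image-plane ego-motion shift, the linear penalty on a hypothetical `specular_ratio` in [0, 1], the `freeze_at` threshold, and all function names. The gate value 9.21 is the 99% chi-square quantile for two degrees of freedom, a common default rather than a value taken from the paper.

```python
import numpy as np


def predict(x, P, F, Q, ego_shift):
    """Kalman predict step; ego-motion is modelled (assumption) as an
    additive shift of the predicted image-plane position."""
    x_pred = F @ x                     # new array, the input x is not modified
    x_pred[:2] += ego_shift            # compensate for the camera's own motion
    P_pred = F @ P @ F.T + Q           # propagate the state covariance
    return x_pred, P_pred


def in_gate(z, x_pred, P_pred, H, R, gate=9.21):
    """Return True if measurement z lies inside the Mahalanobis search region."""
    S = H @ P_pred @ H.T + R           # innovation covariance
    r = z - H @ x_pred                 # innovation (residual)
    d2 = float(r @ np.linalg.solve(S, r))  # squared Mahalanobis distance
    return d2 <= gate


def adaptive_rate(eta0, specular_ratio, freeze_at=0.5):
    """Hypothetical penalty: the rate falls linearly with the specular ratio
    and is frozen at 0 once the ratio reaches freeze_at."""
    if specular_ratio >= freeze_at:
        return 0.0
    return eta0 * (1.0 - specular_ratio / freeze_at)


def update_template(template, observation, eta):
    """Exponential moving-average template update; eta = 0 keeps the template."""
    return (1.0 - eta) * template + eta * observation


# Constant-velocity model with a unit time step (illustrative values only).
F = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
x0 = np.array([0.0, 0.0, 1.0, 0.0])
x_pred, P_pred = predict(x0, np.eye(4), F, 0.01 * np.eye(4),
                         ego_shift=np.array([-0.5, 0.0]))
near = in_gate(np.array([1.0, 0.0]), x_pred, P_pred, H, np.eye(2))
far = in_gate(np.array([10.0, 10.0]), x_pred, P_pred, H, np.eye(2))
```

In this sketch the search region grows with `P_pred`, so the gate widens the longer the target stays occluded, which is one plausible reading of "dynamic" in the abstract. Setting the rate to zero under strong specular reflection prevents a contaminated observation from being blended into the template.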