What question did this study set out to answer?

The research aims to enhance visual tracking capabilities in autonomous systems for various applications.

March 26, 2026 | Open Access

Visual Object Tracking for Autonomous Systems and Other Applications

Key Points

The research aims to enhance visual tracking capabilities in autonomous systems for various applications.
Developed an active vision method that improves the performance of computer vision tasks.
Introduced a geometric modeling framework for UAV visual tracking.
Created a long-term 2D visual tracking framework to handle occlusions, fast motion, and temporary target disappearance.
Proposed an adversarial learning approach to improve tracker precision.
Implemented a Robust Tracking Module (RTM) to mitigate performance degradation from input noise.
The new tracking methods demonstrate improved performance in dynamic environments.
The UAV framework enables intelligent on-the-fly cinematographic planning.
The 2D framework allows recovery from occlusions without tracker re-initialization.
The adversarial approach improves model precision while remaining lightweight enough for embedded systems.
The RTM standardizes input conditions through image-to-image translation, making tracking more resilient to visual distortions.

Abstract

Autonomous systems rely heavily on visual perception to operate reliably in dynamic and unpredictable environments. As these systems gradually transition from passive observation to active interaction, robust and adaptive visual tracking becomes essential for real-time performance. This thesis provides a series of contributions that strengthen the tracking capabilities of such systems, both in specialized contexts like aerial cinematography and in general-purpose autonomous applications. A novel active vision method that improves the performance of computer vision tasks is also provided. A study focusing on Unmanned Aerial Vehicles (UAVs) presents a geometric modeling framework that relates desired shot types, UAV/camera trajectories, and camera focal length constraints to ensure reliable visual tracking, enabling intelligent on-the-fly cinematographic planning. Extending beyond UAVs, a long-term 2D visual tracking framework is developed to handle common challenges such as occlusions, fast motion, and temporary target disappearance. It allows recovery without tracker re-initialization by dynamically adjusting the tracking model based on occlusion severity. To further enhance adaptability, an adversarial learning approach is proposed in which the tracker functions as a generator guided by a discriminator that evaluates response map consistency with a target distribution, improving model precision while remaining lightweight enough for embedded systems. Additionally, a Robust Tracking Module (RTM) is introduced to increase resilience against input noise by applying image-to-image translation, which standardizes input conditions and mitigates performance degradation under visual distortions. The effectiveness of this module is validated through an evaluation toolkit designed to benchmark tracking robustness across different noise types.
Finally, a hierarchical reward-based Reinforcement Learning (RL) framework is proposed that allows robotic systems to learn motion policies that optimize the performance of computer vision methods such as optical character recognition and face recognition. Together, these contributions deliver a comprehensive vision framework that improves the stability, adaptability, and reliability of visual tracking, with broad applicability across domains such as robotics and autonomous systems, surveillance, and smart vehicles, while retaining special relevance to the challenges of UAV-based autonomous cinematography.
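To make the relation between shot type, camera distance, and focal length more concrete, the following is a minimal sketch based on the standard pinhole camera model, in which a target of height H at distance d projects to a height of f * H / d on the sensor. It is not the geometric framework of the thesis. The shot-type names, the coverage fractions in `SHOT_COVERAGE`, and both function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: pinhole-camera relation between shot type,
# target distance, and focal length. All names and coverage values are
# assumptions, not taken from the thesis.

# Hypothetical mapping from shot type to the fraction of the image height
# that the target should occupy.
SHOT_COVERAGE = {"long": 0.2, "medium": 0.5, "close_up": 0.9}


def required_focal_length_mm(shot, target_height_m, distance_m, sensor_height_mm):
    """Focal length (mm) at which the target fills the desired image fraction.

    Pinhole model: projected_height = f * target_height / distance,
    so f = projected_height * distance / target_height.
    """
    if target_height_m <= 0 or distance_m <= 0 or sensor_height_mm <= 0:
        raise ValueError("heights and distance must be positive")
    projected_mm = SHOT_COVERAGE[shot] * sensor_height_mm
    return projected_mm * distance_m / target_height_m


def feasible_distance_range_m(shot, target_height_m, sensor_height_mm,
                              f_min_mm, f_max_mm):
    """Camera-to-target distances (m) at which the shot is achievable
    with a zoom lens limited to [f_min_mm, f_max_mm]."""
    if not 0 < f_min_mm <= f_max_mm:
        raise ValueError("expected 0 < f_min_mm <= f_max_mm")
    if target_height_m <= 0 or sensor_height_mm <= 0:
        raise ValueError("heights must be positive")
    projected_mm = SHOT_COVERAGE[shot] * sensor_height_mm
    # Invert the pinhole relation: distance = f * target_height / projected_height.
    scale = target_height_m / projected_mm
    return f_min_mm * scale, f_max_mm * scale
```

Under these assumptions, a planner could call `feasible_distance_range_m` for the requested shot type and reject UAV trajectories whose distance to the target leaves that interval, since the lens could not then keep the target at the desired size in the frame.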
