December 4, 2025

From Motion to Localization: Cross-view Optimization with Stationary Event and RGB Cameras for Enhanced Pose Estimation

Key points

A stationary event and RGB camera setup improves pose estimation accuracy for augmented reality applications.
Compared with HLoc (SP+SG), the approach reduces translation errors by 12.9% and rotation errors by 13.4% on weekly data collected over four weeks.
A cross-modal object tracker builds a dynamic map from RGB and event data, which is combined with a static 3D Structure from Motion (SfM) model; a cross-view pose optimizer then refines the estimated pose.
The approach also reduces performance degradation in dynamic environments by up to 50% after only four weeks.

Summary

Applications such as Augmented Reality (AR) require accurate device positioning to minimize alignment errors. While visual positioning techniques offer high accuracy, their performance can degrade due to environmental changes like lighting variations and object movements. This paper introduces a new approach to visual positioning, relying on a stationary joint event/RGB sensing platform to track scene dynamics in real time. This platform is at the core of a localization pipeline to predict the pose of user devices. First, a cross-modal object tracker matches dynamic objects between RGB and event images captured by the platform. These objects contribute to building a dynamic map, combined with the initial static 3D Structure from Motion (SfM) model to form a global feature map. Finally, a cross-view pose optimizer estimates pose uncertainties between modalities to refine and improve localization accuracy. To validate our approach, we collect a large-scale dataset over three scenes to account for typical AR scenarios where dynamics can affect the quality of visual positioning. We contribute this dataset to the community for future research on scene dynamics. Our approach shows significant improvement over existing methods, reducing translation and rotation errors by 12.9% and 13.4%, respectively, for weekly data over 4 weeks, and by 38.5% and 16.2% for monthly data over 4 months, compared to HLoc (SP+SG). It also reduces performance degradation by up to 50% after only 4 weeks.
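
The summary reports results as relative reductions in translation and rotation error. As a minimal sketch of how such figures are commonly computed, the Python below uses the standard Euclidean position error and the geodesic angle between rotation matrices. The function names and the numbers in the usage note are illustrative assumptions; the source does not state the paper's exact evaluation protocol.

```python
import math


def translation_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth positions."""
    return math.dist(t_est, t_gt)


def rotation_error_deg(R_est, R_gt):
    """Angle in degrees of the relative rotation R_est^T @ R_gt.

    Rotations are 3x3 nested lists (row-major).
    """
    # trace(R_est^T @ R_gt) equals the sum of element-wise products.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def relative_reduction(baseline_err, new_err):
    """Percentage by which new_err is lower than baseline_err."""
    return 100.0 * (baseline_err - new_err) / baseline_err
```

For example, with made-up values, a baseline error of 2.0 and a new error of 1.5 give `relative_reduction(2.0, 1.5) == 25.0`. These numbers are purely illustrative and are not taken from the paper.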
