We present a novel 360° panoramic video conferencing system that dynamically synchronizes virtual backgrounds with real-time camera motion, addressing the limitations of static backgrounds in conventional systems. By integrating robust human segmentation, monocular visual odometry (VO), and virtual environment rendering, our method achieves seamless alignment between foreground participants and immersive 3D virtual scenes. Unlike prior approaches, in which the foreground and background fall out of sync during camera rotations or user movements, our framework estimates the 3-DoF camera rotation with a hybrid pipeline that combines feature-based patch tracking and pose smoothing, while ignoring translation artifacts to maintain stability. This work bridges the gap between computational efficiency and mixed reality (MR)-driven telepresence, offering a practical solution for next-generation virtual collaboration.
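To make the rotation-only idea concrete, the following is a minimal sketch of how a smoothed rotation estimate could drive a panoramic background. It is not the pipeline described above: the quaternion representation, the slerp-based exponential smoothing, the z-up axis convention, the equirectangular panorama layout, and all function names are assumptions introduced for illustration only.

```python
import numpy as np

def normalize(q):
    """Return q scaled to unit length."""
    q = np.asarray(q, dtype=float)
    return q / np.linalg.norm(q)

def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    q0, q1 = normalize(q0), normalize(q1)
    dot = float(np.dot(q0, q1))
    if dot < 0.0:        # q and -q encode the same rotation; take the shorter arc
        q1, dot = -q1, -dot
    if dot > 0.9995:     # nearly identical: fall back to normalized linear interpolation
        return normalize(q0 + t * (q1 - q0))
    theta = np.arccos(dot)
    return (np.sin((1 - t) * theta) * q0 + np.sin(t * theta) * q1) / np.sin(theta)

def smooth_rotation(prev_smoothed, measured, alpha=0.3):
    """Exponential smoothing on rotations (assumed scheme, hypothetical alpha):
    move a fraction alpha from the previous smoothed pose toward the new estimate."""
    return slerp(prev_smoothed, measured, alpha)

def yaw_quaternion(yaw):
    """Unit quaternion for a rotation of `yaw` radians about the z (up) axis."""
    return np.array([np.cos(yaw / 2), 0.0, 0.0, np.sin(yaw / 2)])

def yaw_from_quaternion(q):
    """Extract yaw (rotation about z) in radians from a unit quaternion."""
    w, x, y, z = q
    return np.arctan2(2 * (w * z + x * y), 1 - 2 * (y * y + z * z))

def panorama_column_shift(yaw, pano_width):
    """Horizontal pixel offset into an equirectangular panorama for a given yaw."""
    return int(round(yaw / (2 * np.pi) * pano_width)) % pano_width

# Usage: one smoothing step toward a 90-degree yaw measurement, then the
# corresponding column offset in a 4096-pixel-wide panorama.
identity = yaw_quaternion(0.0)
measured = yaw_quaternion(np.pi / 2)
smoothed = smooth_rotation(identity, measured, alpha=0.3)
shift = panorama_column_shift(yaw_from_quaternion(smoothed), 4096)
```

The sketch handles only the yaw component, which reduces to a horizontal shift in an equirectangular image. Pitch and roll would require resampling the panorama on the sphere, which is omitted here.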