What question did this study set out to answer?

This study asks whether stereo visual-inertial odometry can be initialized directly from image intensities, avoiding the errors that feature tracking and Structure-from-Motion (SfM) introduce.

April 13, 2026 | Open Access

Direct Sparse Initialization for Stereo Visual-Inertial Odometry

Key Points

Introduced a direct initialization method linking image intensities to initial parameters.
Developed a prediction function to compute corresponding points from initial parameters.
Formulated an objective function to minimize photometric error using sparse points.
Proposed an approximation method for two-frame initialization.
Achieved superior estimation accuracy and initialization success rate compared to existing methods.
Demonstrated effective performance even with minimal frame data (3 frames).
With only 3 frames, outperformed state-of-the-art methods that use 10 frames in most metrics.

Abstract

Existing stereo visual-inertial initialization methods, whether tightly or loosely coupled, rely critically on intermediate variables like feature correspondences and camera poses rather than original image data. Computing these variables through feature tracking and Structure-from-Motion (SfM) inherently introduces errors, adversely affecting results. To overcome this limitation, we propose a direct initialization method for stereo visual-inertial odometry, which directly bridges original image intensities and initial parameters, bypassing conventional intermediate variables. Specifically, we introduce a prediction function to compute the corresponding points from the initial parameters. Then we formulate an objective function that optimizes initial parameters by minimizing the photometric error of sparse points, eliminating the need for feature tracking and SfM. The metric scale in our initialization is directly determined by the stereo baseline. We further propose an approximation method for two-frame initialization, demonstrating its efficacy even with minimal frame data. Extensive experiments confirm that our method achieves superior performance in both estimation accuracy and initialization success rate with shorter runtime. Even with 3 frames for initialization, our method outperforms the state-of-the-art methods using 10 frames in most metrics.
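The core idea of the abstract (predict corresponding points from parameters, then score them by photometric error over sparse points) can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes a pinhole camera, rectified stereo, and a relative pose `(R, t)` passed in directly as a stand-in for whatever the initial parameters determine. All function and variable names here are hypothetical.

```python
import numpy as np

def depth_from_disparity(fx, baseline, disparity):
    """Metric depth of a rectified stereo point: z = fx * baseline / disparity."""
    return fx * baseline / disparity

def predict_points(K, R, t, pixels, depths):
    """Predict where reference-frame pixels land in a target frame.

    K is the 3x3 intrinsic matrix. R and t (assumed inputs) map
    reference-camera coordinates to target-camera coordinates.
    pixels is an (N, 2) array of (u, v); depths is an (N,) array in metres.
    """
    ones = np.ones((pixels.shape[0], 1))
    rays = (np.linalg.inv(K) @ np.hstack([pixels, ones]).T).T  # normalized rays
    points_ref = rays * depths[:, None]   # back-project to 3D
    points_tgt = points_ref @ R.T + t     # rigid transform into target frame
    proj = points_tgt @ K.T               # pinhole projection
    return proj[:, :2] / proj[:, 2:3]

def bilinear(image, uv):
    """Sample intensities at sub-pixel locations (u = column, v = row)."""
    h, w = image.shape
    u = np.clip(uv[:, 0], 0, w - 1.001)   # keep the 2x2 neighbourhood in bounds
    v = np.clip(uv[:, 1], 0, h - 1.001)
    u0 = np.floor(u).astype(int)
    v0 = np.floor(v).astype(int)
    du, dv = u - u0, v - v0
    return ((1 - du) * (1 - dv) * image[v0, u0]
            + du * (1 - dv) * image[v0, u0 + 1]
            + (1 - du) * dv * image[v0 + 1, u0]
            + du * dv * image[v0 + 1, u0 + 1])

def photometric_cost(img_ref, img_tgt, K, R, t, pixels, depths):
    """Sum of squared intensity differences over the sparse points."""
    predicted = predict_points(K, R, t, pixels, depths)
    residual = bilinear(img_tgt, predicted) - bilinear(img_ref, pixels)
    return float(np.sum(residual ** 2))
```

In a sketch like this, an optimizer would adjust the parameters behind `R` and `t` to reduce `photometric_cost`, with no feature matching step in between. The depths come from stereo disparity, which is how the baseline fixes the metric scale.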
