What question did this study set out to answer?

The study asks whether a multi-modal belief state representation, learned from LiDAR and depth sensors, can let a deep reinforcement learning agent navigate autonomously under partial observability.

April 15, 2026 · Open Access

Deep Reinforcement Learning for Navigation via Multi-Modal Belief State Representation from LiDAR and Depth Sensors

Key Points

The study targets autonomous navigation under partial observability, using a multi-modal belief state representation.
The navigation framework is built with deep reinforcement learning.
A probabilistic representation module models belief states as Gaussian distributions over latent environmental features.
Sensor-specific encoders extract features from raw LiDAR and depth inputs.
The learned representation is integrated into a Soft Actor–Critic (SAC) framework for decision-making.
Simulated experiments show improved success rate, navigation efficiency, and generalization.
In real-world experiments, the method reduced average travel time by 16% and path length by 4% relative to a classical navigation baseline.

Abstract

This paper presents a deep reinforcement learning framework for autonomous navigation based on multi-modal belief state representation learned from LiDAR and depth sensors. To address the challenges posed by partial observability and sensor-specific uncertainty, we propose a probabilistic representation module that models belief states as Gaussian distributions over latent environmental features. Sensor-specific encoders extract structured features from raw LiDAR and depth inputs, which are fused using a Q-value-guided weighting scheme derived from the policy critic. A motion-prediction pretraining strategy and a cross-modal coherence loss are introduced to enhance the alignment and reliability of the learned belief states. The resulting representation is integrated into a Soft Actor–Critic (SAC) framework to enable policy-driven decision-making under uncertainty. Extensive experiments in simulated environments demonstrate that the proposed method improves success rate, navigation efficiency, and generalization. Real-world experiments further validate these findings, with the proposed method outperforming a classical navigation baseline by reducing average travel time by 16% and path length by 4%. These results support the use of probabilistic multi-modal belief modeling for autonomous navigation under partial observability.
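
As a rough illustration of the fusion step described in the abstract, the sketch below combines two sensor-specific Gaussian beliefs using weights obtained from a softmax over per-sensor Q-values, and scores their agreement with a symmetric KL divergence. The abstract does not give the exact weighting rule or the form of the coherence loss, so the softmax weighting, the moment-matched mixture, the symmetric KL term, and all names (`fuse_beliefs`, `coherence_loss`, `temperature`) are assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scalar scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_beliefs(means, variances, q_values, temperature=1.0):
    """Fuse per-sensor diagonal Gaussian beliefs into one Gaussian.

    Assumption: each sensor's weight is a softmax over a scalar Q-value
    estimate, and the fused belief is the Gaussian that matches the mean
    and variance of the resulting weighted mixture.
    """
    weights = softmax(q_values, temperature)
    dim = len(means[0])
    fused_mean = [
        sum(w * mu[d] for w, mu in zip(weights, means)) for d in range(dim)
    ]
    # Mixture variance: E[x^2] - (E[x])^2, computed per latent dimension.
    fused_var = [
        sum(w * (var[d] + mu[d] ** 2)
            for w, mu, var in zip(weights, means, variances))
        - fused_mean[d] ** 2
        for d in range(dim)
    ]
    return fused_mean, fused_var, weights


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(p || q) between two diagonal Gaussians."""
    return sum(
        0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )


def coherence_loss(mu_a, var_a, mu_b, var_b):
    """Symmetric KL between two beliefs (a stand-in coherence penalty)."""
    return 0.5 * (
        gaussian_kl(mu_a, var_a, mu_b, var_b)
        + gaussian_kl(mu_b, var_b, mu_a, var_a)
    )


# Hypothetical two-dimensional latent beliefs from each encoder.
lidar_mu, lidar_var = [0.0, 0.0], [1.0, 1.0]
depth_mu, depth_var = [2.0, 2.0], [1.0, 1.0]

fused_mu, fused_var, weights = fuse_beliefs(
    [lidar_mu, depth_mu], [lidar_var, depth_var], q_values=[1.2, 0.4]
)
penalty = coherence_loss(lidar_mu, lidar_var, depth_mu, depth_var)
```

In this sketch the sensor with the higher Q-value estimate pulls the fused mean toward its own belief, and the coherence penalty is zero only when the two beliefs coincide. In a full system these quantities would be produced by learned encoders and the SAC critic; here they are hard-coded to keep the example self-contained.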
