Limited access to large-scale, task-specific navigation data remains a major obstacle to designing lightweight, real-time sensory-substitution devices (SSDs) for blind and low-vision travellers. Current datasets rarely contain collision examples or temporal context, forcing most AI pipelines to rely on generic scene-understanding tasks whose output is too high-dimensional for haptic or auditory feedback. We created NavIndoor, a Unity-based, procedurally generated maze environment that delivers paired RGB and semantic-segmentation streams together with collision-aware rewards. A head-mounted blind “digital twin” was trained end-to-end with reinforcement learning to estimate a four-way action/value vector. Training used 250 episodes with a progressive increase in difficulty and semantic mask augmentation. Performance was (i) benchmarked against human players in-sim; (ii) probed on five unseen houses from the Active Vision Dataset (AVD) via linear classifiers that predict forward-path safety thresholds; (iii) timed on a Jetson Orin Nano running from a portable power supply; and (iv) evaluated on an early-stage haptic prototype during a proof-of-concept trial with a single blindfolded participant. In simulation, the best model attained a mean reward of 74.3 (73% of the human score). On AVD, it achieved an AUC of 0.92 for detecting safe forward pathways and outperformed larger self-supervised or supervised backbones. The estimated value function $V_\theta(s)$ correlated monotonically with the walkable distance, validating its interpretation as a one-dimensional safety signal. The full pipeline ran at 7 to 16 frames per second (FPS) on the Jetson, with an end-to-end (image-to-motor) latency of 1.1 s. The model’s outputs guided the blindfolded participant through a 38-meter curved corridor without collisions.
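The two evaluation quantities mentioned above, a monotonic relationship between the value estimate and walkable distance and an AUC for safe-path detection, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the toy numbers, and the 1.0 m safety threshold are all hypothetical, Spearman rank correlation is only one possible way to test monotonicity, and the raw value estimate stands in here for the output of a linear classifier.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """AUC as the probability that a positive outscores a negative."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: per-frame value estimates and walkable distance (m).
values = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]
distance_m = [0.5, 1.5, 1.2, 3.0, 4.0, 0.8]

# Hypothetical safety threshold: the forward path is "safe" if >= 1.0 m is free.
safe = [d >= 1.0 for d in distance_m]

rho = spearman_rho(values, distance_m)
auc = roc_auc(values, safe)
print(f"Spearman rho = {rho:.2f}, AUC = {auc:.2f}")
```

A rank-based statistic is used in this sketch because it measures monotonicity without assuming that the value estimate is linear in distance.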
Training on procedurally generated semantic maps yields compact RL policies whose safety cues transfer directly to real indoor scenes without photorealistic rendering or domain adaptation. The approach enables portable, low-power SSD prototypes and lays the groundwork for forthcoming clinical validation of safe, real-time navigation aids for the visually impaired.

• The limited availability of navigation data for visually impaired individuals restricts the development of AI-driven assistive technologies.
• NavIndoor is a new software tool that procedurally generates mazes, enabling the rapid extraction of synthetic, human-like navigation data.
• Synthetic virtual-environment data, combined with reinforcement learning, enables real-time extraction of low-dimensional cues for safe pathway identification.
Sarbout et al. (Fri,) studied this question.