Accurately forecasting pedestrian motion from a first-person perspective is critical for advancing assistive robotics, augmented and virtual reality (AR/VR), and natural human-robot interaction. Unlike traditional third-person trajectory prediction, first-person forecasting introduces unique challenges such as egocentric camera motion, dynamic shifts in viewpoint, and partial observability of the scene. To address these complexities, I present AEye, a novel deep learning architecture that integrates temporal sequence modeling, graph-based interaction reasoning, and probabilistic multimodal decoding for robust and socially aware trajectory prediction. AEye consists of three core modules: (1) a Temporal Encoder that models each agent's motion dynamics with a stack of recurrent neural units, capturing historical movement patterns; (2) an Interaction Graph Network that captures social and spatial dependencies between agents through dynamic graph attention; and (3) a Trajectory Decoder that outputs a multimodal probability distribution over plausible future positions. Unlike prior approaches that rely on simplified social pooling or monolithic attention mechanisms, AEye fuses local egocentric perception with explicit inter-agent reasoning, enabling accurate predictions even in dense, dynamic, and occluded environments. To validate its performance, I benchmark AEye on widely used first-person trajectory datasets, including EgoPedestrian and JAAD, as well as a custom egocentric dataset collected using wearable cameras. Experimental results demonstrate that AEye significantly improves the key evaluation metrics, Average Displacement Error (ADE) and Final Displacement Error (FDE), over state-of-the-art baseline models such as Social-LSTM, Social-GAN, and Transformer-based architectures.
Furthermore, extensive ablation studies highlight the complementary and essential contributions of the temporal and graph-based modules. These findings suggest that AEye provides a powerful and computationally efficient foundation for enabling safer and more natural navigation in next-generation assistive exoskeletons, autonomous mobility devices, and immersive AR/VR systems.
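For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how ADE and FDE are commonly computed. It is not taken from the AEye implementation: the function names `displacement_errors` and `best_of_k` are hypothetical, trajectories are assumed to be lists of 2D points in whatever unit the dataset uses (pixels or meters), and the best-of-K reduction for multimodal output is one common convention that may differ from the protocol used in the experiments.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred, truth: equal-length sequences of (x, y) positions,
    one entry per future time step (assumed layout).
    """
    if not pred or len(pred) != len(truth):
        raise ValueError("trajectories must be non-empty and equal length")
    # Euclidean distance between prediction and ground truth at each step.
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean error over all steps
    fde = dists[-1]                # error at the final step only
    return ade, fde


def best_of_k(samples, truth):
    """Return (minADE, minFDE) over K sampled trajectories.

    Assumed convention for multimodal decoders: each metric is
    minimized independently across the K samples.
    """
    errors = [displacement_errors(s, truth) for s in samples]
    return min(e[0] for e in errors), min(e[1] for e in errors)


truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
drifting = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]

ade, fde = displacement_errors(drifting, truth)
min_ade, min_fde = best_of_k([drifting, truth], truth)
```

In this toy case the drifting prediction is off by 0, 1, and 2 units at the three steps, so its ADE is the mean of those distances and its FDE is the last one. Lower values are better for both metrics.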
Nithilan Rengapragash (Tue,) studied this question.