What type of study is this?

This is a Cohort Study (also classified as: Quantitative Study, Experimental Study).

September 12, 2025 · Open Access

AEye: A Novel First-Person Human Path Prediction Architecture Leveraging Graph-Based Social Interaction and Probabilistic Multimodal Decoding

Key Points

AEye improves accuracy in predicting pedestrian movement from a first-person view in dynamic and occluded settings.
Experimental results show significant reductions in Average Displacement Error (ADE) and Final Displacement Error (FDE) relative to baseline models.
The architecture integrates a Temporal Encoder, Interaction Graph Network, and Trajectory Decoder for comprehensive forecasting.
These advancements highlight AEye's potential applications in assistive robotics, AR/VR environments, and autonomous systems.
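
As a point of reference for the two metrics named above, the following is a minimal sketch of how ADE and FDE are conventionally computed for 2D trajectories. The function names `displacement_errors` and `best_of_k` are hypothetical, and the best-of-K protocol is a common convention for multimodal predictors that is assumed here, not a detail confirmed by the paper.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) points.

    ADE is the mean Euclidean distance over all predicted time steps;
    FDE is the Euclidean distance at the final time step only.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    ade = sum(dists) / len(dists)
    fde = dists[-1]
    return ade, fde


def best_of_k(candidates, truth):
    """Assumed protocol: report the minimum ADE and minimum FDE over K
    candidate trajectories sampled from a multimodal predictor."""
    errors = [displacement_errors(c, truth) for c in candidates]
    return min(e[0] for e in errors), min(e[1] for e in errors)
```

Lower values are better for both metrics, so an improvement in ADE or FDE means a smaller number.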

Abstract

Accurately forecasting pedestrian motion from a first-person perspective is critical for advancing assistive robotics, augmented and virtual reality (AR/VR), and natural human-robot interaction. Unlike traditional third-person trajectory prediction, first-person forecasting introduces unique challenges such as egocentric camera motion, dynamic shifts in viewpoint, and partial observability of the scene. To address these challenges, I present AEye, a novel deep learning architecture that integrates temporal sequence modeling, graph-based interaction reasoning, and probabilistic multimodal decoding for robust, socially aware trajectory prediction. AEye consists of three core modules: (1) a Temporal Encoder that models individual agent motion dynamics using a stack of recurrent neural units, capturing historical movement patterns; (2) an Interaction Graph Network that captures social and spatial dependencies between agents through dynamic graph attention mechanisms; and (3) a Trajectory Decoder that outputs a multimodal probabilistic distribution over plausible future positions. Unlike prior approaches that rely on simplified social pooling or monolithic attention mechanisms, AEye fuses local egocentric perception with explicit inter-agent reasoning, enabling accurate predictions even in dense, dynamic, and occluded environments. To validate its performance, I benchmark AEye on widely used first-person trajectory datasets, including EgoPedestrian and JAAD, as well as a custom egocentric dataset collected using wearable cameras. Experimental results demonstrate that AEye significantly improves on key evaluation metrics, including Average Displacement Error (ADE) and Final Displacement Error (FDE), compared with state-of-the-art baseline models such as Social-LSTM, Social-GAN, and Transformer-based architectures. Furthermore, extensive ablation studies highlight the complementary and essential contributions of the temporal and graph-based modules. These findings suggest that AEye provides a powerful and computationally efficient foundation for enabling safer and more natural navigation in next-generation assistive exoskeletons, autonomous mobility devices, and immersive AR/VR systems.
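
To make the idea of attention-based interaction reasoning between agents more concrete, here is a minimal sketch of one message-passing round over a fully connected agent graph. It uses plain scaled dot-product attention with no learned weights, which is a simplification chosen for illustration and not the paper's dynamic graph attention mechanism. The names `attend` and `interaction_step`, the fully connected graph, and the residual fusion are all assumptions.

```python
import math


def attend(query, neighbors):
    """Scaled dot-product attention of one agent embedding over its neighbors.

    query: list of floats; neighbors: list of same-length float lists.
    Returns (attention weights, weighted sum of neighbor embeddings).
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, n)) / math.sqrt(d) for n in neighbors]
    m = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [sum(w * n[i] for w, n in zip(weights, neighbors)) for i in range(d)]
    return weights, context


def interaction_step(embeddings):
    """One round in which every agent attends over all other agents.

    Assumption: the graph is fully connected and the social context is
    fused with the agent's own embedding by simple addition.
    """
    updated = []
    for i, h in enumerate(embeddings):
        others = [e for j, e in enumerate(embeddings) if j != i]
        if not others:  # a lone agent has no social context to add
            updated.append(list(h))
            continue
        _, context = attend(h, others)
        updated.append([a + b for a, b in zip(h, context)])
    return updated
```

In a full model of the kind the abstract describes, the input embeddings would presumably come from the Temporal Encoder and the updated embeddings would feed the Trajectory Decoder; that wiring is inferred from the module descriptions, not specified in the source.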
