What question did this study set out to answer?

The aim is to develop an effective gesture recognition model to enhance communication between autonomous vehicles and pedestrians.

March 18, 2026. Open Access

PGR‐Net: A Pedestrian Gesture Recognition Model for Effective AV‐Human Interactions in Autonomous Vehicles

Key points

Introduced the PGR-Net model using a spatiotemporal deep learning approach.
Created the PGR-Net v1.0 dataset with AV-relevant gesture classes.
Fused R(2+1)D, a 3D-CNN architecture, with a hand-pose landmark stream, followed by RNN encoders and self-attention, for gesture recognition.
Evaluated model performance on the PGR-Net v1.0 dataset.
Achieved 88.29% accuracy, an absolute improvement of 12.56 percentage points over the baseline R(2+1)D model.
Indicated sensible generalization in qualitative tests on single images from outside the dataset.
Highlighted the significance of short spatiotemporal context in gesture recognition.
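
As a rough illustration of how self-attention can emphasise gesture-relevant frames, the sketch below applies scaled dot-product self-attention to a toy sequence of per-frame feature vectors. It is a minimal sketch, not the PGR-Net implementation: it assumes identity projections (queries, keys, and values are all the raw frame features, with no learned weights), and the names `self_attention` and `clip` as well as the toy values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention over per-frame feature vectors.

    Assumption: identity projections (Q = K = V = frames), so there are
    no learned parameters. Returns (outputs, weights), where weights[i][j]
    is how strongly frame i attends to frame j.
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in frames:
        # Similarity of this frame to every frame in the clip.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in frames]
        row = softmax(scores)
        weights.append(row)
        # Attention-weighted sum of all frame features.
        outputs.append(
            [sum(w * value[i] for w, value in zip(row, frames)) for i in range(dim)]
        )
    return outputs, weights


# Toy clip: frames 1 and 2 carry strong (gesture-like) features,
# frames 0 and 3 are near-zero (idle) features.
clip = [[0.1, 0.0], [2.0, 1.5], [2.1, 1.4], [0.0, 0.1]]
outputs, weights = self_attention(clip)
```

In this toy clip, the two frames with strong features attend mostly to each other, which is the behaviour that lets a downstream classifier focus on the short span of frames where the gesture occurs.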

Abstract

Autonomous vehicles (AVs) and advanced driver assistance systems (ADAS) continue to advance, yet effective social coordination with human road users (HRUs) remains a key challenge. This study introduces the PGR‐Net model, a spatiotemporal deep learning (DL) approach for pedestrian gesture recognition (PGR) to bridge the gap in AV‐pedestrian communication. We created the PGR‐Net v1.0 dataset by remapping Jester gesture labels to AV‐relevant classes: Stop, Go, and Greeting/Thanking. Furthermore, a No Gesture class is defined via a sequential hand‐presence rule. The PGR‐Net fuses an R(2+1)D, a three‐dimensional convolutional neural network (3D‐CNN) architecture, and a spatiotemporal stream with hand‐pose landmarks, followed by recurrent neural network (RNN) encoders and self‐attention layers to emphasise gesture‐relevant frames. On the PGR‐Net v1.0 dataset, the PGR‐Netv2 achieves 88.29% accuracy, an absolute 12.56% improvement over the baseline R(2+1)D model. Qualitative tests on single images beyond the dataset indicate sensible generalisation and highlight the importance of short spatiotemporal context for PGR. These results suggest that hand‐augmented spatiotemporal modelling is a viable path toward a robust and AV‐relevant PGR for various traffic scenarios. We discuss current limitations due to the limited availability of PGR‐specific datasets and outline directions for broader in‐the‐wild data and context‐aware modelling to improve applicability.
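
The dataset construction described in the abstract (remapping Jester labels to AV-relevant classes and defining No Gesture through a sequential hand-presence rule) could look roughly like the sketch below. The abstract does not state which Jester labels map to which class or how the hand-presence rule is defined, so both are assumptions here: the entries in `JESTER_TO_AV` are illustrative guesses, and `is_no_gesture` with its `min_run` threshold is a hypothetical rule, not the authors' definition.

```python
# Hypothetical mapping: the source does not list the actual label assignments.
JESTER_TO_AV = {
    "Stop Sign": "Stop",
    "Pushing Hand Away": "Stop",
    "Swiping Left": "Go",
    "Swiping Right": "Go",
    "Thumb Up": "Greeting/Thanking",
    "Shaking Hand": "Greeting/Thanking",
}


def remap_label(jester_label):
    """Return the AV-relevant class for a Jester label, or None if unused."""
    return JESTER_TO_AV.get(jester_label)


def longest_hand_run(hand_present):
    """Length of the longest run of consecutive frames with a detected hand."""
    longest = current = 0
    for present in hand_present:
        current = current + 1 if present else 0
        longest = max(longest, current)
    return longest


def is_no_gesture(hand_present, min_run=3):
    """Hypothetical sequential rule: a clip counts as No Gesture when a hand
    is never visible for at least `min_run` consecutive frames."""
    return longest_hand_run(hand_present) < min_run


# Per-frame hand detections for two toy clips (True = hand detected).
flicker_clip = [False, True, False, True, False]
steady_clip = [False, True, True, True, False]
```

Under these assumptions, `flicker_clip` would be labeled No Gesture because the hand never persists across three frames, while `steady_clip` would keep its remapped gesture label.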

Cite This Study

Mahdi et al. (Thu,) studied this question.

synapsesocial.com/papers/69ba43884e9516ffd37a4e03 https://doi.org/10.1049/itr2.70197
