Human Action Recognition (HAR) is a central problem in computer vision, video surveillance, and human-computer interaction (HCI), where efficient and accurate models are needed to improve the interaction experience. Traditional HAR methods often rely on hand-crafted features and shallow learning techniques, which limit their ability to capture complex patterns. In contrast, this study proposes an efficient HAR model built on deep neural networks, specifically a combination of a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN), to enhance HCI through AI-powered action understanding. The model uses a pre-trained EfficientNetB7 network to extract spatial features from individual video frames, followed by a Long Short-Term Memory (LSTM) network that captures long-range temporal dependencies across the frame sequence. According to the authors, this architecture improves recognition accuracy while reducing computational complexity, which makes it well suited to HCI applications. In the reported experiments, the model achieves a classification accuracy of 97.8% on the UCF101 dataset and 80.1% on the HMDB51 dataset, outperforming state-of-the-art HAR models. The proposed model reaches these results without auxiliary techniques such as data augmentation, which underlines its efficiency and its potential for real-world HCI applications that depend on accurate and efficient recognition of human actions.
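The CNN-plus-LSTM design described in the abstract can be illustrated with a minimal Keras sketch. This is not the authors' implementation: the number of frames per clip, the frame resolution, the LSTM width, the frozen backbone, and the function name `build_cnn_lstm` are all assumptions chosen for illustration, since the abstract does not specify them. The backbone is built with `weights=None` so the sketch runs without downloading anything; passing `weights="imagenet"` would give a pre-trained backbone as the abstract describes.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_cnn_lstm(num_frames=16, frame_size=224, num_classes=101, lstm_units=256):
    """Illustrative CNN+LSTM video classifier (hyperparameters are assumptions)."""
    # Spatial feature extractor: EfficientNetB7 without its classification head.
    # pooling="avg" turns each frame into a single feature vector.
    # weights=None keeps the sketch offline; use weights="imagenet" for a
    # pre-trained backbone.
    backbone = tf.keras.applications.EfficientNetB7(
        include_top=False,
        weights=None,
        input_shape=(frame_size, frame_size, 3),
        pooling="avg",
    )
    # Assumption: the backbone is frozen and only the temporal head is trained.
    backbone.trainable = False

    # A clip is a sequence of RGB frames: (num_frames, height, width, channels).
    frames = layers.Input(shape=(num_frames, frame_size, frame_size, 3))

    # Apply the same CNN to every frame -> (batch, num_frames, feature_dim).
    features = layers.TimeDistributed(backbone)(frames)

    # The LSTM reads the per-frame features in order and returns its final
    # hidden state as a summary of the whole clip.
    clip_summary = layers.LSTM(lstm_units)(features)

    # One softmax output per action class (101 for UCF101, 51 for HMDB51).
    outputs = layers.Dense(num_classes, activation="softmax")(clip_summary)
    return models.Model(inputs=frames, outputs=outputs)
```

Calling `build_cnn_lstm(num_classes=51)` would give the same layout sized for HMDB51. Wrapping the backbone in `TimeDistributed` shares one set of CNN weights across all frames, so the temporal modelling is left entirely to the LSTM.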
Noorah Alghasham
Waleed Albattah
PLoS ONE
Alghasham et al. studied this question.
www.synapsesocial.com/papers/69abc2555af8044f7a4ebe07 — DOI: https://doi.org/10.1371/journal.pone.0343132