Human Action Recognition (HAR) for CCTV-oriented applications remains a challenging problem. Implementing HAR in real-world scenarios is difficult because of the gap between the data requirements of Deep Learning and what CCTV-based frameworks can offer in terms of data recording equipment. We propose to reduce this gap by exploiting human poses provided by OpenPose, which has already been proven to be an effective detector in CCTV-like recordings for tracking applications. In this work, we first propose ActionXPose, a novel 2D pose-based approach for pose-level HAR. ActionXPose extracts low- and high-level features from body poses, which are provided to a Long Short-Term Memory Neural Network and a 1D Convolutional Neural Network for classification. We also provide a new dataset, named ISLD, for realistic pose-level HAR in a CCTV-like environment, recorded in the Intelligent Sensing Lab. ActionXPose is extensively tested on ISLD under multiple experimental settings, e.g. Dataset Augmentation and Cross-Dataset settings, as well as revising other existing datasets for HAR. ActionXPose achieves state-of-the-art performance in terms of accuracy, very high robustness to occlusions and missing data, and promising results for practical implementation in real-world applications.
Federico Angelini
Thales (United Kingdom)
Zeyu Fu
University of Exeter
Yang Long
Durham University
IEEE Transactions on Multimedia
Newcastle University
Durham University
Inception Institute of Artificial Intelligence
Angelini et al. studied this question.
synapsesocial.com/papers/6a2251e03b8e99975a4ecea3 — DOI: https://doi.org/10.1109/tmm.2019.2944745