March 1, 2024

RGB-D Videos Human Action Recognition: A Taxonomy-Based Survey to Learn the Discriminative Features

Abstract

Human action recognition plays an important role in modern computer vision. Here, a sensor can accurately capture human actions and their interactions from a previously unseen data sequence. Human activity identification in video sequences is an active area of computer vision research owing to its practical applications, which include security, surveillance, healthcare, robotics, animation, sports analysis, smart home automation, and behavioral analysis. A goal of the AI community is to create a system that can observe and understand human behavior and actions fully autonomously. For example, a robot assistant might aid a patient undergoing home monitoring by analyzing the most effective method of exercise and so helping to avoid additional injuries, thereby increasing the robot's usefulness to society. This kind of smart technology would be immensely useful because it eliminates the need for unnecessary doctor visits, lowers healthcare costs, and allows for constant remote monitoring of the patient. Many feature-based methods, both handcrafted and automatically learned, have emerged in the past two decades for identifying human actions in video footage. Traditional methods of human activity identification rely on carefully designed features that capture the most fundamental motions. Over the last several years, deep learning algorithms have made great progress in a number of domains, including human location prediction, object recognition, segmentation, audio analysis, object tracking, and super-resolution. Deep learning models are also central to visual recognition tasks, where they offer a more efficient, time-saving alternative to manual feature extraction. Handcrafted-feature solutions were shown to be effective; however, they relied too heavily on feature descriptors for action classification, which demanded additional labor and specialized knowledge. By contrast, the automated feature extraction from raw video and the improved recognition rates provided by deep learning-based systems have made them the de facto standard. Despite this progress, several fundamental challenges in human activity recognition remain unresolved. Deep learning techniques were introduced alongside the development of the improved Kinect depth sensor, yet hardly any activity recognition methods combine RGB, depth, and 3D skeleton coordinates. A video is multimedia information presented as a sequence of images, displayed at a given number of frames per second. Identifying events in videos therefore requires spatio-temporal feature extraction. Using data collected from sensors such as video cameras and depth sensors, along with other modalities, this thesis aims to automatically recognize and analyze human activities.

Cite This Study

Girish Padhan studied this question.

synapsesocial.com/papers/68e76bc9b6db6435876e167c https://doi.org/10.1109/ic-cgu58078.2024.10530720
