In contrast to human action recognition (HAR), understanding complex human behavior (CHB), which consists of multiple basic actions, poses a significant challenge for researchers because of its extended duration, numerous types, and substantial data-labeling costs. In this paper, we propose a new approach that recognizes CHB from a semantic point of view, which can be roughly summarized as judging by action quantization and action combination similarity. To fully evaluate the effectiveness of our method, we extend our self-collected dataset, HanYue Action3D, into the first public skeleton-based dataset with complex behavior samples and temporal calibration. Experimental results demonstrate the feasibility and universal superiority of our method. Moreover, our method’s zero-shot learning capability bridges the divide between laboratory settings and real-world applications.
Xie et al. (Thu,) studied this question.