Inspired by the recent success of hierarchical representations, we propose a new hierarchical variant of latent Dirichlet allocation (h-LDA) for action recognition. The model consists of an appearance group and a motion group, and each group contains a new hierarchical structure of two-layer topics that learns the spatio-temporal patterns (STPs) of human actions. The basic idea is that the two topic layers model the local STPs and the global STPs of the actions, respectively. Two groups of discrete words are generated from two complementary kinds of features for each group. Each topic learned in these two groups describes a particular aspect of the actions. Specifically, the mid-level topics describe the local STPs by incorporating the geometric structure information in the lower-level words. The top-level topics are learned from the mid-level topics and are mixture distributions over the local STPs, which makes them appropriate for representing the global STPs. In addition, we derive the learning and inference procedure using Gibbs sampling under reasonable assumptions. Finally, each sample is discriminatively represented as a probability distribution over the global STPs learned by the proposed h-LDA. Experimental results on two datasets demonstrate the effectiveness of our approach for action recognition.
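To make the Gibbs-sampling step and the final representation more concrete, the following is a minimal sketch of collapsed Gibbs sampling for standard, single-layer LDA. It is not the h-LDA model described above: it has no appearance and motion groups, no two-layer topics, and no geometric structure information. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy word indices are all assumptions made for illustration. What it does show is the general idea of describing each sample by its probability distribution over learned topics.

```python
import numpy as np

def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iters=200, seed=0):
    """Collapsed Gibbs sampling for flat LDA (illustrative only, not h-LDA).

    docs: list of lists of integer word indices in [0, vocab_size).
    Returns theta, an array of shape (n_docs, n_topics) whose rows are the
    estimated per-sample topic distributions.
    """
    rng = np.random.default_rng(seed)
    n_docs = len(docs)
    doc_topic = np.zeros((n_docs, n_topics))        # words in doc d assigned to topic k
    topic_word = np.zeros((n_topics, vocab_size))   # times word w is assigned to topic k
    topic_total = np.zeros(n_topics)                # total words assigned to topic k

    # Random initial topic assignment for every word occurrence.
    assignments = []
    for d, doc in enumerate(docs):
        z = rng.integers(n_topics, size=len(doc))
        assignments.append(z)
        for w, k in zip(doc, z):
            doc_topic[d, k] += 1
            topic_word[k, w] += 1
            topic_total[k] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            z = assignments[d]
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = z[i]
                doc_topic[d, k] -= 1
                topic_word[k, w] -= 1
                topic_total[k] -= 1

                # Conditional distribution over topics for this word.
                p = (doc_topic[d] + alpha) * (topic_word[:, w] + beta)
                p = p / (topic_total + vocab_size * beta)
                k = rng.choice(n_topics, p=p / p.sum())

                # Record the newly sampled assignment.
                z[i] = k
                doc_topic[d, k] += 1
                topic_word[k, w] += 1
                topic_total[k] += 1

    # Smoothed per-sample topic proportions.
    theta = doc_topic + alpha
    theta = theta / theta.sum(axis=1, keepdims=True)
    return theta

# Toy usage: three "samples" over a hypothetical vocabulary of 6 visual words.
docs = [[0, 1, 0, 1, 2], [3, 4, 3, 4, 5], [0, 1, 1, 0]]
theta = lda_gibbs(docs, n_topics=2, vocab_size=6, n_iters=50)
print(theta.shape)  # (3, 2)
```

In this flat setting, each row of `theta` plays a role loosely analogous to the per-sample distribution over global STPs mentioned in the abstract, and could be passed to any standard classifier as a feature vector.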
Yang et al. (Fri,) studied this question.