Abstract. To address the task of generating human motion from complex natural language instructions, this paper proposes a 3D skeleton-based motion generation method that integrates an action nesting graph. The method first constructs the action nesting graph through a language parsing module to capture the segmented structure of actions in the instruction. A graph convolutional neural network then models the correspondence between the nested structure and the keyframes of the skeleton, and a stage-wise decoupling module is introduced to improve the naturalness of motion transitions. On the KIT Motion and BEAT-Motion datasets, the method achieves improvements of 14.7% in structural preservation rate and 10.2% in stage boundary consistency. The results demonstrate that the proposed nesting-based modeling mechanism effectively enhances the model’s ability to interpret complex composite actions and improves the quality of motion generation.
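The abstract does not specify how the action nesting graph is represented or how the graph convolution is parameterized. As a rough illustration only, the sketch below assumes the parsed instruction is available as a nested (label, children) tree, flattens it into parent-child edges, and applies one weight-free propagation step with the commonly used symmetric normalization over the adjacency with self-loops. The example instruction, the function names `build_nesting_graph` and `gcn_step`, and the one-hot node features are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the instruction, its nesting, and all names are hypothetical.

def build_nesting_graph(tree):
    """Flatten a nested (label, children) tree into node labels and parent-child edges."""
    labels, edges = [], []

    def visit(node, parent):
        label, children = node
        idx = len(labels)
        labels.append(label)
        if parent is not None:
            edges.append((parent, idx))
        for child in children:
            visit(child, idx)

    visit(tree, None)
    return labels, edges


def gcn_step(features, edges):
    """One propagation step H' = D^{-1/2} (A + I) D^{-1/2} H, with no learned weights."""
    n = len(features)
    # Undirected adjacency with self-loops.
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for a, b in edges:
        adj[a][b] = adj[b][a] = 1.0
    deg = [sum(row) for row in adj]
    dim = len(features[0])
    out = []
    for i in range(n):
        row = [0.0] * dim
        for j in range(n):
            if adj[i][j]:
                w = adj[i][j] / (deg[i] * deg[j]) ** 0.5
                for k in range(dim):
                    row[k] += w * features[j][k]
        out.append(row)
    return out


# Hypothetical composite instruction with one nested sub-action.
instruction = ("walk forward, then wave", [
    ("walk forward", []),
    ("wave", [("raise arm", []), ("shake hand", [])]),
])
labels, edges = build_nesting_graph(instruction)

# One-hot features stand in for learned segment embeddings.
features = [[float(i == j) for j in range(len(labels))] for i in range(len(labels))]
smoothed = gcn_step(features, edges)
```

In a full model the propagated node features would presumably be multiplied by learned weights and aligned with skeleton keyframes; that part is omitted here because the abstract gives no detail on it.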
Hart et al. (Fri,) studied this question.