Human motion prediction involves accurately modeling and forecasting the future trajectories of human movements. It is a key step in understanding complex human dynamics and has broad applications in areas such as rehabilitation training and autonomous driving. Core skeletal joints and distal limb joints differ both in the complexity of their motion and in how closely their trajectories follow that of the hip joint, so prediction errors vary from joint to joint. In this paper, a two‐stage 3D human motion prediction algorithm based on spatiotemporal enhancement (STE) is proposed, combining spatiotemporal graph convolutional networks (ST‐GC) and self‐attention (SA) mechanisms. Our method predicts 3D human motion by treating core skeletal joints and distal limb joints separately. For core skeletal joints, the spatial graph convolution (S‐GC) and spatiotemporal attention (STA) mechanisms incorporate a learnable graph enhancement matrix and spatiotemporal biases to improve the stability of motion prediction. For distal limb joints, the STA mechanism employs a learnable spatiotemporal graph enhancement matrix (ST‐GEM). The proposed method is validated on three datasets. Experimental results support the rationale for a partitioned, staged treatment of the skeletal structure and verify the effectiveness of the motion prediction models for both core skeletal joints and distal limb joints. © 2026 Institute of Electrical Engineers of Japan. Published by Wiley Periodicals LLC.
Yang et al. studied this question.
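As a rough illustration of the partitioned strategy described in the abstract, the following Python sketch applies a single spatial graph convolution with an additive enhancement matrix and then splits the joints into core and distal groups. The additive form of the enhancement, the 5‐joint chain, the index split, and all function and variable names are assumptions made for illustration only; the abstract does not specify the formulation used in the paper.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetrically normalize A + I, as in a standard graph convolution."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def enhanced_graph_conv(x, adj, enhancement, weight):
    """One spatial graph-convolution step with an additive enhancement matrix.

    x:           (frames, joints, in_features) joint features per frame
    adj:         (joints, joints) fixed skeleton adjacency
    enhancement: (joints, joints) matrix that would be learned during training
    weight:      (in_features, out_features) feature projection
    """
    # Assumption: the learnable matrix is simply added to the fixed graph.
    mixing = normalize_adjacency(adj) + enhancement
    # Mix joint features within each frame, then project the feature dimension.
    return np.einsum("jk,tkf->tjf", mixing, x) @ weight

# Hypothetical 5-joint chain: joints 0-2 stand in for "core", 3-4 for "distal".
adj = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    adj[i, j] = adj[j, i] = 1.0
core_idx, distal_idx = [0, 1, 2], [3, 4]

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 5, 3))    # 10 frames, 5 joints, 3D coordinates
enhancement = np.zeros((5, 5))     # zero init reduces to a plain graph convolution
weight = rng.normal(size=(3, 8))

features = enhanced_graph_conv(x, adj, enhancement, weight)
core_features = features[:, core_idx]      # would feed a first-stage predictor
distal_features = features[:, distal_idx]  # would feed a second-stage predictor
```

With the enhancement matrix initialized to zero, the layer behaves as an ordinary graph convolution over the fixed skeleton, so any learned entries can be read as corrections to the physical connectivity. This is one plausible reading of "graph enhancement", not a confirmed description of the authors' model.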