In this paper, we propose a monocular 3D pose estimation method that explicitly takes into account both temporal information and the angles between the camera's optical axis and the bones (camera-bone angles). The proposed method combines a 2D-to-3D method, which predicts a 3D pose from a sequence of 2D poses, with a convolutional neural network (CNN), and includes a novel regularization loss that enables the CNN to extract camera-bone-angle information. The camera-bone-angle and temporal information reduce the ambiguity inherent in 2D-to-3D methods, where the same 2D pose can be mapped to multiple 3D poses. Experiments on the Human3.6M and MPI-INF-3DHP datasets showed that the proposed method reduced the mean per joint position error (MPJPE) by 5.1 mm and 2.1 mm, respectively.
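To make the two quantities named in the abstract concrete, the following is a minimal sketch of how a camera-bone angle and MPJPE could be computed. It is not the authors' implementation. It assumes joints are given in camera coordinates with the optical axis along +z, and the function names, the `bones` index pairs, and the toy coordinates are hypothetical choices made for illustration only.

```python
import numpy as np


def camera_bone_angles(joints_cam, bones, optical_axis=(0.0, 0.0, 1.0)):
    """Angle (radians) between each bone and the camera optical axis.

    joints_cam: (J, 3) joint positions in camera coordinates (assumed).
    bones: list of (parent, child) joint-index pairs (hypothetical skeleton).
    optical_axis: direction of the optical axis; +z is an assumed convention.
    """
    joints_cam = np.asarray(joints_cam, dtype=float)
    axis = np.asarray(optical_axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    parents = [p for p, _ in bones]
    children = [c for _, c in bones]
    # Bone vectors point from the parent joint to the child joint.
    vecs = joints_cam[children] - joints_cam[parents]
    cos = (vecs @ axis) / np.linalg.norm(vecs, axis=1)
    # Clip to guard against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(cos, -1.0, 1.0))


def mpjpe(pred, target):
    """Mean per joint position error: mean Euclidean distance over joints."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean(np.linalg.norm(pred - target, axis=-1)))


# Toy example of the 2D-to-3D ambiguity: two one-bone poses whose x and y
# coordinates are identical (same 2D pose under an orthographic projection,
# a simplification) but whose child joint lies behind or in front of the parent.
bones = [(0, 1)]
pose_away = [[0.0, 0.0, 3.0], [0.3, 0.0, 3.4]]    # bone tilts away from camera
pose_toward = [[0.0, 0.0, 3.0], [0.3, 0.0, 2.6]]  # bone tilts toward camera

angle_away = camera_bone_angles(pose_away, bones)[0]
angle_toward = camera_bone_angles(pose_toward, bones)[0]
print(angle_away, angle_toward)      # the two angles are supplementary
print(mpjpe(pose_toward, pose_away))
```

In this toy case the two poses are indistinguishable from their x and y coordinates alone, yet their camera-bone angles differ, which illustrates why such an angle could help disambiguate the depth of a bone.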
Ishii et al. studied this question.