Abstract
Recent approaches to 3D human mesh recovery from monocular images primarily rely on optimizing the SMPL model using sparse 2D cues such as 2D keypoints or silhouettes. However, these constraints often lead to incorrect 3D pose estimates because of inherent depth ambiguity. To address this fundamental limitation, we propose DenseSMPLify, a novel optimization framework that leverages dense pseudo‐normal maps estimated by Sapiens as complementary geometric guidance. Our key insight is that normal maps encode richer 3D shape and depth cues than sparse 2D joints, significantly reducing ambiguity during optimization. A critical challenge lies in the misalignment between SMPL‐derived normals and pseudo‐normal maps. We address this with an alignment algorithm based on Continuous Surface Embedding (CSE), which establishes pixel‐level correspondence between the two normal maps before the error is computed. Experiments demonstrate that DenseSMPLify outperforms SMPLify and other optimization baselines in both pose accuracy and shape realism. Notably, our method consistently improves upon the initial poses, even when initialized with state‐of‐the‐art regression models.
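The idea of aligning two normal maps before measuring their error can be illustrated with a minimal sketch. The abstract does not specify the alignment algorithm or the loss, so everything below is an assumption made for illustration: the function name `aligned_normal_loss` is hypothetical, correspondences are found by a plain nearest-neighbour search in embedding space, and the error is taken as one minus the cosine similarity of matched normals.

```python
import numpy as np


def aligned_normal_loss(emb_smpl, normals_smpl, emb_img, normals_img):
    """Mean (1 - cosine) normal error after embedding-based matching.

    Hypothetical illustration, not the paper's implementation.
    emb_smpl:     (N, D) surface embeddings at rendered SMPL pixels
    normals_smpl: (N, 3) SMPL-derived normals at those pixels
    emb_img:      (M, D) surface embeddings at image foreground pixels
    normals_img:  (M, 3) pseudo-normals at those pixels
    """
    # Pairwise squared distances between the two embedding sets: (N, M).
    diff = emb_smpl[:, None, :] - emb_img[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)

    # Assumed matching rule: each SMPL pixel takes its nearest image pixel.
    nearest = dist2.argmin(axis=1)
    matched = normals_img[nearest]

    # Normalize both sets so the dot product is a cosine similarity.
    a = normals_smpl / np.linalg.norm(normals_smpl, axis=1, keepdims=True)
    b = matched / np.linalg.norm(matched, axis=1, keepdims=True)
    cos = (a * b).sum(axis=1)

    return float(np.mean(1.0 - cos))


# Toy data: the image-side pixels are a permutation of the SMPL-side pixels.
emb = np.eye(3)
normals = np.eye(3)
perm = [2, 0, 1]
loss_matched = aligned_normal_loss(emb, normals, emb[perm], normals[perm])
loss_flipped = aligned_normal_loss(emb, normals, emb[perm], -normals[perm])
```

In this toy case the matching step undoes the permutation, so identical normals give zero error, whereas comparing the two sets index by index without alignment would report a large error even though the surfaces agree.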
Zhang et al. (Thu,) studied this question.