Animals are integral to our society, contributing to the richness and variety of life and providing companionship for many people. Estimating 3D shape and pose from animal imagery is vital for studying animal behavior and for creating lifelike representations in educational and virtual reality applications. Existing approaches to single-view animal reconstruction either struggle to generate 3D models that are well aligned with the input image, or lack fine details and consistent topology. To address these issues, we propose HiAnimal, a novel framework for generating high-fidelity, animatable 3D animals that scales well to novel data. HiAnimal combines the strengths of a pixel-aligned implicit prior and a parametric model. Specifically, we predict a per-pixel occupancy-UV field from an intermediate normal map inferred from the input image, which improves the model's ability to capture fine details and to generalize across instances. We further incorporate a parametric shape prior by aligning the SMAL model with the estimated reconstruction, which ensures consistent mesh topology and yields more regularized results with cleaner geometry. To make this alignment precise, we introduce a novel method that estimates a pixel-aligned UV correspondence field. Extensive experiments show that HiAnimal outperforms state-of-the-art approaches.
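To make the idea of alignment through UV correspondences more concrete, the following is a minimal sketch and not the HiAnimal implementation. It assumes that each reconstructed surface point carries a predicted UV coordinate and that each template vertex has a known UV coordinate; it matches the two sets by nearest UV and then solves for a rigid transform with the standard Kabsch procedure. Fitting SMAL in practice would optimize pose and shape parameters rather than a single rigid transform, and the function names `match_by_uv`, `kabsch`, and `align_template` are hypothetical.

```python
import numpy as np

def match_by_uv(template_uv, recon_uv):
    """For each template vertex, return the index of the reconstructed
    point whose UV coordinate is nearest (brute-force search)."""
    d2 = ((template_uv[:, None, :] - recon_uv[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def kabsch(src, dst):
    """Least-squares rotation R and translation t with R @ src_i + t ~= dst_i."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)      # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def align_template(template_xyz, template_uv, recon_xyz, recon_uv):
    """Rigidly align template vertices to the reconstruction using
    UV-matched point pairs (illustrative assumption, not the paper's fitting)."""
    idx = match_by_uv(template_uv, recon_uv)
    R, t = kabsch(template_xyz, recon_xyz[idx])
    return template_xyz @ R.T + t, R, t
```

The brute-force UV search is quadratic in the number of points; a spatial index such as a k-d tree would be a natural substitute for larger meshes.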
Zhang et al. studied this question.