What type of study is this?

This is a Quantitative Study (also classified as: Experimental Study).

September 29, 2025

Open Access

MGHanD: Multi-modal Guidance for authentic Hand Diffusion

Key Points

MGHanD improves the quality of generated hands in images by applying multi-modal guidance during inference.
The method achieves significant improvements in hand realism, producing anatomically correct fingers and natural hand shapes.
Quantitative and qualitative evaluations show that MGHanD outperforms existing models in generating realistic hand images.
Visual and textual guidance, combined with a cumulative hand mask that grows over the time steps, improve image outputs without requiring any specific conditions or priors.

Abstract

Diffusion-based methods have achieved significant success in text-to-image (T2I) generation, producing realistic images from text prompts. Despite their capabilities, these models still struggle to generate realistic human hands, often producing incorrect finger counts and structurally deformed hands. MGHanD addresses this challenge by applying multi-modal guidance during the inference process. For visual guidance, we employ a discriminator trained on a dataset of paired real and generated images with captions, derived from various hand-in-the-wild datasets. For textual guidance, we employ a LoRA adapter, which learns, at the latent level, the direction from 'hands' towards more detailed prompts such as 'natural hands' and 'anatomically correct fingers'. A cumulative hand mask, gradually enlarged over the assigned time steps, is applied to the added guidance, allowing the hand to be refined while maintaining the rich generative capabilities of the pre-trained model. In the experiments, our method achieves superior hand generation quality without any specific conditions or priors. We carry out both quantitative and qualitative evaluations, along with user studies, to showcase the benefits of our approach in producing high-quality hand images.
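The masking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the cumulative mask is the running union of per-step hand masks, and it uses a single generic guidance term as a stand-in for the combined discriminator and LoRA guidance. The function names (union_mask, apply_masked_guidance, guided_steps) and the scale parameter are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]


def union_mask(cumulative: Grid, current: Grid) -> Grid:
    """Element-wise maximum: a cell marked as hand at any step stays marked."""
    return [[max(c, m) for c, m in zip(row_c, row_m)]
            for row_c, row_m in zip(cumulative, current)]


def apply_masked_guidance(latent: Grid, guidance: Grid, mask: Grid,
                          scale: float) -> Grid:
    """Add the scaled guidance term only where the mask is non-zero."""
    return [[x + scale * m * g for x, g, m in zip(row_x, row_g, row_m)]
            for row_x, row_g, row_m in zip(latent, guidance, mask)]


def guided_steps(latent: Grid, per_step_masks: List[Grid],
                 per_step_guidance: List[Grid],
                 scale: float = 0.5) -> Tuple[Grid, Grid]:
    """Run the guided steps, growing the mask before each update.

    Assumption: the guidance grids are given; in a real pipeline they would
    come from the discriminator and the LoRA-based text direction.
    """
    cumulative = [[0.0] * len(latent[0]) for _ in latent]
    for mask_t, guidance_t in zip(per_step_masks, per_step_guidance):
        cumulative = union_mask(cumulative, mask_t)
        latent = apply_masked_guidance(latent, guidance_t, cumulative, scale)
    return latent, cumulative
```

In this sketch, cells outside the cumulative mask are never modified by the guidance, which mirrors the stated goal of refining the hand while leaving the rest of the pre-trained model's output untouched.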

Authors

Taehyeon Eum

Jieun Choi

World Bank

Tae‐Kyun Kim

Korea Advanced Institute of Science and Technology
