Recent progress has advanced the generation of realistic talking-face videos for avatars. Animating 3D cartoon avatars nevertheless remains difficult because face-driven data is imprecise, which often shows up as inconsistent mouth shapes and rigid facial expressions that limit the animation's realism. To address these issues, we introduce a conformer-based framework that derives expression coefficients directly from phonemes, improving prediction precision and reducing manual oversight. In addition, we apply a zero-shot adaptation technique that combines a pre-trained emotion blending module with a keyframe of the target emotional character, which amplifies emotional expression and makes lip dynamics more authentic. Our experiments show that the method captures subtle expression shifts in avatars and produces lifelike animations.
Zhao et al. studied this question.
synapsesocial.com/papers/68e7397eb6db6435876b2a5d — DOI: https://doi.org/10.1109/icassp48485.2024.10447526
Yi Zhao
Chunyu Qiang
OriginWater (China)
Hao Li
OriginWater (China)
Kyoto University
National Institute of Information and Communications Technology
Kuaishou (China)