June 12, 2024

Talking Head Generation Based on 3D Morphable Facial Model

Hsin-Yu Shen (National Yang Ming Chiao Tung University), Wen-Jiin Tsai (National Yang Ming Chiao Tung University)

Abstract

This paper presents a framework for one-shot talking-head video generation that takes a single image of a person and audio clips as input and synthesizes photo-realistic videos with natural head poses and lip motion synchronized to the driving audio. The main idea is to use 3D Morphable Model (3DMM) parameters as an intermediate representation for video generation. We design an Expression Predictor and a Head Pose Predictor to predict facial expression and head-pose parameters from audio, respectively, and use a 3DMM to extract identity and texture parameters from the reference image. From these parameters, facial images are rendered and used as auxiliary guidance for video generation. Compared with the widely used facial landmarks, 3DMM parameters are more powerful in representing facial details. Experimental results show that our method generates realistic talking-head videos and outperforms many state-of-the-art methods.
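
To make the role of the intermediate representation concrete, the following is a minimal sketch of how a linear 3DMM combines a fixed identity with per-frame expression and pose parameters. It is not the authors' implementation: the class `ToyMorphableModel`, the functions `apply_yaw` and `animate`, and the two-vertex toy data are all hypothetical. The sketch assumes the common linear formulation (mean shape plus weighted identity and expression bases), reduces head pose to a single yaw angle, and omits texture, rendering, and the audio-driven predictors entirely.

```python
import math
from dataclasses import dataclass
from typing import List

Vector = List[float]  # flattened vertex coordinates: [x0, y0, z0, x1, y1, z1, ...]


@dataclass
class ToyMorphableModel:
    """Hypothetical linear 3DMM: shape = mean + identity offsets + expression offsets."""
    mean_shape: Vector
    identity_basis: List[Vector]
    expression_basis: List[Vector]

    def reconstruct(self, identity: Vector, expression: Vector) -> Vector:
        shape = list(self.mean_shape)
        # Add every basis vector, scaled by its coefficient, to the mean shape.
        for coeffs, bases in ((identity, self.identity_basis),
                              (expression, self.expression_basis)):
            for coeff, basis in zip(coeffs, bases):
                for i, offset in enumerate(basis):
                    shape[i] += coeff * offset
        return shape


def apply_yaw(shape: Vector, yaw: float) -> Vector:
    """Rotate each vertex about the vertical (y) axis; a stand-in for a full head pose."""
    c, s = math.cos(yaw), math.sin(yaw)
    posed: Vector = []
    for i in range(0, len(shape), 3):
        x, y, z = shape[i:i + 3]
        posed.extend([c * x + s * z, y, -s * x + c * z])
    return posed


def animate(model: ToyMorphableModel, identity: Vector,
            expressions: List[Vector], yaws: List[float]) -> List[Vector]:
    """Identity is fixed (from the reference image); expression and pose vary per frame."""
    return [apply_yaw(model.reconstruct(identity, expression), yaw)
            for expression, yaw in zip(expressions, yaws)]


# Toy data (assumed): two vertices, one identity basis vector, one expression basis vector.
model = ToyMorphableModel(
    mean_shape=[0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    identity_basis=[[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]],
    expression_basis=[[0.0, -1.0, 0.0, 0.0, 0.0, 0.0]],
)
frames = animate(model, identity=[0.5],
                 expressions=[[0.0], [0.2]], yaws=[0.0, math.pi / 2])
```

In the framework described above, the per-frame expression and pose values would come from the Expression Predictor and Head Pose Predictor, while the identity coefficients would be extracted once from the reference image; here they are simply hard-coded.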

Cite This Study

synapsesocial.com/papers/68e651cbb6db6435875e2935 https://doi.org/10.1109/pcs60826.2024.10566437
