March 16, 2024

Audio-Driven Talking Face Video Generation with Emotion

Jiadong Liang (King University), Feng Lu (North Sichuan Medical University)

Abstract

Vivid talking face generation has potential applications in virtual reality. Existing methods can generate talking faces that are synchronized with the audio, but typically ignore the accurate expression of emotions. In this paper, we propose an advanced two-step framework to synthesize talking face videos with vivid emotional appearances. The first step is designed to generate emotional fine-grained landmarks, including the normalized landmarks, gaze, and head pose. In the second step, we map the facial landmarks to latent key points, which are then fed into the pre-trained model to generate high-quality face images. Extensive experiments demonstrate the effectiveness of our method.

Cite This Study

Liang et al. (2024) studied this question.

synapsesocial.com/papers/68e73cb2b6db6435876b5e2c https://doi.org/10.1109/vrw62533.2024.00227
