To further improve facial expression generation for virtual characters, this paper studies how a character's facial expressions are affected by changes in its voice. To this end, we explore a design method that allows facial expressions to "understand" sound, and we design a step-by-step experimental process: starting from the key acoustic features of the voice, we progressively derive the corresponding facial movements, generate continuously changing expressions, and test their realism and emotional match in user experiments. The results show that vocal emotion parameters provide effective cues for inferring action units, that the generated high-quality expressions are close to real expressions in naturalness, and that users can effectively perceive differences in emotional consistency across quality conditions. This study therefore offers a more natural path for expressing emotion in virtual interactions.
Ge et al. studied this question.