Demand for expressive, lifelike avatars in digital communication is growing rapidly, especially in domains where protecting user identity is paramount. We introduce SynthFace, a privacy-preserving framework for generating realistic talking avatars that convey a user's expressions while concealing their real-world identity. Unlike conventional methods that depend on GAN-generated samples or real facial imagery, SynthFace employs a key-conditioned synthetic identity generator that deterministically produces a unique, non-invertible face representation. This identity is derived from a cryptographic hash of a secret key and a reference image, which makes it unlinkable to any real individual and reproducible whenever the same key and image are supplied. To animate the avatar, we extract facial expressions and head pose from a source video using 3D Morphable Models (3DMMs) and fuse them with the synthetic identity at the coefficient level. A mel-spectrogram-driven expression generator then synthesizes temporally coherent lip and head movements, enabling accurate speech-driven animation. We evaluate SynthFace along three dimensions: identity protection, using ArcFace cosine similarity and SSIM; expression fidelity, using a Facial Expression Recognition (FER) classifier, GradSim similarity, and Dlib's 68-point landmark distances; and perceptual quality, using PSNR and optical-flow-based temporal consistency. Experimental results and qualitative visualizations show that SynthFace achieves high expression realism while strongly suppressing identity, making it well suited to privacy-sensitive applications such as teletherapy, virtual education, and secure online communication.
Kaur et al. (Wed,) studied this question.
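
The key-conditioned identity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the hash construction or the coefficient layout, so the use of HMAC-SHA256, the Gaussian sampling, the dimension `ID_DIM`, and the names `derive_identity` and `swap_identity` are all assumptions made for illustration. The sketch only shows the two properties the abstract claims: the same key and image always yield the same identity, and the source video's expression and pose coefficients are kept while the identity coefficients are replaced.

```python
import hashlib
import hmac
import random
from typing import Dict, List

ID_DIM = 80  # assumed number of 3DMM identity coefficients (hypothetical)


def derive_identity(secret_key: bytes, reference_image: bytes,
                    dim: int = ID_DIM) -> List[float]:
    """Deterministically map (secret key, reference image) to identity coefficients.

    Assumption: a keyed hash (HMAC-SHA256) over the image digest seeds a
    pseudo-random generator; the abstract only says "cryptographic hash".
    """
    image_digest = hashlib.sha256(reference_image).digest()
    seed_bytes = hmac.new(secret_key, image_digest, hashlib.sha256).digest()
    rng = random.Random(int.from_bytes(seed_bytes, "big"))
    # Sample coefficients from a standard normal, a common prior for 3DMM
    # identity parameters (assumed here, not stated in the abstract).
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def swap_identity(source_coeffs: Dict[str, List[float]],
                  synthetic_identity: List[float]) -> Dict[str, List[float]]:
    """Fuse at the coefficient level: keep expression and pose, replace identity."""
    fused = {name: list(values) for name, values in source_coeffs.items()}
    fused["id"] = list(synthetic_identity)
    return fused


# Hypothetical per-frame coefficients extracted from a source video.
frame = {"id": [0.3] * ID_DIM, "exp": [0.1, -0.2, 0.05], "pose": [0.0, 0.1, 0.0]}
synthetic = derive_identity(b"user-secret-key", b"raw reference image bytes")
avatar_frame = swap_identity(frame, synthetic)
```

In this sketch, changing either the key or the reference image produces a different seed and therefore a different identity vector, while recovering the key or image from the coefficients would require inverting the hash.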