Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation