Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss