Learning Alignment for Multimodal Emotion Recognition from Speech | Synapse