MemoCMT: multimodal emotion recognition using cross-modal transformer-based feature fusion | Synapse