Cross-modal emotion recognition with causality inference in human conversations | Synapse