UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection