Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | Synapse