Joint Syntax Representation Learning and Visual Cue Translation for Video Captioning | Synapse