MAViC: Multimodal Active Learning for Video Captioning | Synapse