Structure-Aware Multimodal Sequential Learning for Visual Dialog | Synapse