Learning Consistent Feature Representation for Cross-Modal Multimedia Retrieval | Synapse