Decrypt Modality Gap in Multimodal Contrastive Learning: From Convergent Representation to Pair Alignment