XKD: Cross-Modal Knowledge Distillation with Domain Alignment for Video Representation Learning