Multi-Level Cross-Modal Alignment for Image Clustering | Synapse