Unsupervised Audio-Visual Segmentation with Modality Alignment | Synapse