Semantic-Assisted Object Clustering for Multi-Modal Referring Video Segmentation | Synapse