Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations