Task-aware cross-modal refinement and liquid fusion for text-visual grounding | Synapse