Multimodal Fusion for Vocal Biomarkers Using Vector Cross-Attention | Synapse