A Speech-to-Video Synthesis Approach Using Spatio-Temporal Diffusion for Vocal Tract MRI | Synapse