VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis | Synapse