DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability