Photorealistic fire scene video generation via multimodal large language model and pre-trained video diffusion model | Synapse