Finetuning a text-to-audio generation model for room impulse response generation | Synapse