Efficient Real-Time Scene Description with Vision-Language Models | Synapse