Building and better understanding vision-language models: insights and future directions | Synapse