Words Over Pixels? Rethinking Vision in Multimodal Large Language Models | Synapse