Rethinking attention cues: Multi-Factor guided token pruning for efficient vision-language understanding | Synapse