Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study | Synapse