Benchmarking Large Language Models with a Unified Performance Ranking Metric | Synapse