Benchmarking large language models on the United States Medical Licensing Examination for clinical reasoning and medical licensing scenarios | Synapse