Large Language Model-Based Evaluation of Medical Question Answering Systems: Algorithm Development and Case Study | Synapse