Large-scale evaluation of multimodal large language models for pneumothorax detection | Synapse