In recent years, deep learning models, notably multi-modal large language models (MLLMs), have been widely applied in the medical domain. Because image quality critically affects diagnostic performance, blind image quality assessment (BIQA) has become essential. However, conventional BIQA methods are grounded in human visual perception, which differs substantially from how MLLMs perceive images; they may therefore assess image utility suboptimally and increase the risk of misdiagnosis. To address this, we propose MedQM, the first BIQA framework for medical MLLMs. We introduce a novel BIQA model, MedQM-I, which leverages Medical Textual Priors and Implicit Feature Queries to guide attention to diagnostically important regions and features, and uses a gated Mixture-of-Experts for adaptive and robust scoring. Furthermore, we present an automatic score labeling approach oriented toward MLLM vision: MedQM-D efficiently and accurately computes the diagnostic loss of an image, and a sigmoid-based quality mapping converts this loss into a quality score, enhancing both training stability and interpretability. Experimental results demonstrate that MedQM notably outperforms traditional BIQA methods, validating its effectiveness.
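To make the sigmoid-based quality mapping concrete, the following is a minimal sketch of how a diagnostic loss could be turned into a bounded quality score. The exact functional form used by MedQM is not given here, so the function name `quality_from_loss` and the parameters `midpoint` and `temperature` are hypothetical, chosen only for illustration. The sketch assumes that a lower loss should yield a higher score and that scores lie in the interval from 0 to 1.

```python
import math


def quality_from_loss(loss: float, midpoint: float = 1.0, temperature: float = 0.5) -> float:
    """Map a diagnostic loss to a quality score in [0, 1].

    Illustrative assumption, not the paper's stated formula:
        q = 1 / (1 + exp((loss - midpoint) / temperature))
    `midpoint` is the loss that maps to a score of 0.5, and
    `temperature` controls how sharply the score falls as loss grows.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    z = (loss - midpoint) / temperature
    # Evaluate the logistic function of -z in a numerically stable way,
    # so that very large or very small losses do not overflow exp().
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


if __name__ == "__main__":
    # Lower diagnostic loss should give a higher quality score.
    for loss in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(f"loss={loss:.1f} -> quality={quality_from_loss(loss):.4f}")
```

Under these assumptions the mapping is monotonically decreasing in the loss and saturates at both ends, which compresses extreme loss values into a bounded range. This is one plausible reading of why such a mapping would help training stability and interpretability.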
Yang et al. (Thu,) studied this question.