To meet the growing demand for radiology artificial-intelligence tools, a 3D vision–language model called Merlin was trained on abdominal computed-tomography scans, radiology reports and electronic health records. Merlin demonstrated stronger off-the-shelf performance than other vision–language models at three hospital sites distinct from the initial training centre, highlighting its potential for broader clinical adoption. On average, the model outperformed the second-best models by 20% across hospital sites.