This paper presents the participation of the Microsoft Research RADPHI3 team in the Hidden-RAD Challenge: Hidden Causality Inclusion in Radiology Reports. The task aims to recover hidden causality from radiology reports, optionally accompanied by their corresponding frontal chest X-rays (CXRs). We fine-tune small language models, specifically Rad-Phi-3.5 Vision-CXR, to recover the causality analysis in both language-only and multi-modal settings, taking radiology reports and radiology images as inputs. We also include baselines from a range of general-domain models, some of them specifically tuned for reasoning tasks: GPT-4o, LLaMA 3.3, Phi4, DeepSeek, OpenAI o1, OpenAI o1-mini, and OpenAI o3-mini. Through these experiments, we evaluate the effectiveness of general-domain models, reasoning-specialized models, and fine-tuned domain-specific small language models in generating causal explanations from radiology reports and, optionally, images.
Ranjit et al. (Fri,) studied this question.