August 15, 2025Open Access

Beyond Post hoc Explanations: A Comprehensive Framework for Accountable AI in Medical Imaging Through Transparency, Interpretability, and Explainability

Key Points

LIME achieves superior fidelity (0.81) compared to SHAP and Grad-CAM, indicating improved reliability in medical imaging.
Meta-analysis of 67 studies shows post hoc explanations have poor stability, with significant degradation under noise perturbation.
Proposed framework emphasizes transparency, interpretability, and cautious post hoc explanations for enhanced accountability in AI.
Identified modality-specific patterns indicate the need for tailored XAI approaches to maintain clinical decision-making quality.

Abstract

The integration of artificial intelligence (AI) in medical imaging has revolutionized diagnostic capabilities, yet the black-box nature of deep learning models poses significant challenges for clinical adoption. Current explainable AI (XAI) approaches, including SHAP, LIME, and Grad-CAM, predominantly focus on post hoc explanations that may inadvertently undermine clinical decision-making by providing misleading confidence in AI outputs. This paper presents a systematic review and meta-analysis of 67 studies (covering 23 radiology, 19 pathology, and 25 ophthalmology applications) evaluating XAI fidelity, stability, and performance trade-offs across medical imaging modalities. Our meta-analysis of 847 initially identified studies reveals that LIME achieves superior fidelity (0.81, 95% CI: 0.78–0.84) compared to SHAP (0.38, 95% CI: 0.35–0.41) and Grad-CAM (0.54, 95% CI: 0.51–0.57) across all modalities. Post hoc explanations demonstrated poor stability under noise perturbation, with SHAP showing 53% degradation in ophthalmology applications (ρ = 0.42 at 10% noise) compared to 11% in radiology (ρ = 0.89). We demonstrate a consistent 5–7% AUC performance penalty for interpretable models but identify modality-specific stability patterns suggesting that tailored XAI approaches are necessary. Based on these empirical findings, we propose a comprehensive three-pillar accountability framework that prioritizes transparency in model development, interpretability in architecture design, and a cautious deployment of post hoc explanations with explicit uncertainty quantification. This approach offers a pathway toward genuinely accountable AI systems that enhance rather than compromise clinical decision-making quality and patient safety.

Read Full Paperexternally

Mark Helpful

Bookmark

Relay

View Full Paper