Background Explainable artificial intelligence (XAI) is used in healthcare to make machine-learning outputs more transparent and clinically usable. This matters because many machine-learning models behave like a "black box", which can hide bias and reduce trust in the model. XAI addresses this problem by showing which features or image regions influenced a result, either for one patient or across a dataset.
Objectives Our objective is to provide a clear, systematic review of how XAI is being used in healthcare. We summarize the main XAI methods, the data and models they are paired with, and how these explanations support clinical understanding across imaging, diagnosis, and rehabilitation.
Methods We performed a systematic review with narrative synthesis of 36 empirical studies (2020–2025) across three verticals: Imaging (n = 10), Diagnosis (n = 16), and Rehabilitation (n = 10). Studies were identified via PubMed/MEDLINE, IEEE Xplore, and Google Scholar, following PRISMA 2020 guidelines. We included research studies that employed XAI in these three verticals and excluded review articles and viewpoint studies. Screening numbers were: records identified, 1,481; duplicates removed, 647; other removals, 187; screened, 647; excluded, 532; reports sought, 115; not retrieved, 31; assessed, 84; full-text excluded, 48; included, 36. From each study we extracted ML models, XAI methods, study design, methodologies, and dataset/source. Meta-analysis was not undertaken due to heterogeneity.
Results Across the 36 studies, SHAP was used in 21, Grad-CAM in approximately 12, and LIME in approximately 11. A clear method-modality fit emerged: Imaging predominantly used saliency/heat-map methods, especially Grad-CAM, for spatial evidence, whereas Diagnosis and Rehabilitation were dominated by feature-attribution tools such as SHAP and LIME for global and case-level explanations. Many papers combined two or more explainers to cross-check interpretations, namely SHAP + LIME and Grad-CAM + LIME.
Conclusion Recent healthcare XAI demonstrates consistent method-modality fit and frequently combines two or more methods, helping translate opaque predictions into clinician-oriented reasoning. To enable trustworthy deployment, future work should pair these practices with standardized XAI reporting, faithfulness/stability assessments, and external, cross-site validation.
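The screening counts reported in the Methods can be checked with simple arithmetic. The sketch below is only an illustrative consistency check, not part of the review's own tooling: the dictionary keys and the helper `check_prisma_flow` are hypothetical names introduced here, and the numbers are copied from the abstract. It assumes each stage equals the previous stage minus the records removed at that step.

```python
# PRISMA 2020 flow counts as reported in the Methods paragraph.
# Key names are illustrative labels, not terms taken from the paper.
flow = {
    "identified": 1481,
    "duplicates_removed": 647,
    "other_removals": 187,
    "screened": 647,
    "excluded_at_screening": 532,
    "reports_sought": 115,
    "not_retrieved": 31,
    "assessed": 84,
    "full_text_excluded": 48,
    "included": 36,
}

# Studies per vertical, as reported in the abstract.
verticals = {"Imaging": 10, "Diagnosis": 16, "Rehabilitation": 10}


def check_prisma_flow(f):
    """Return (stage, expected, reported) for every stage that does not add up."""
    expected = {
        "screened": f["identified"] - f["duplicates_removed"] - f["other_removals"],
        "reports_sought": f["screened"] - f["excluded_at_screening"],
        "assessed": f["reports_sought"] - f["not_retrieved"],
        "included": f["assessed"] - f["full_text_excluded"],
    }
    return [(stage, value, f[stage]) for stage, value in expected.items() if value != f[stage]]


mismatches = check_prisma_flow(flow)
print("mismatches:", mismatches)                  # an empty list means the stages are consistent
print("vertical total:", sum(verticals.values())) # should match the included count
```

With the reported figures, every stage follows from the one before it, and the three vertical counts sum to the 36 included studies.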
Aravindkumar et al. (Wed,) studied this question.