Artificial Intelligence (AI) shows significant potential across healthcare domains, including advanced diagnostics, clinical decision support, and personalized medicine. Despite these advances, the opaque ‘black box’ nature of complex AI models calls for Explainable Artificial Intelligence (XAI) to support trust, accountability, interpretability, and regulatory compliance. This systematic review covers 76 studies published between 2020 and 2025 that applied XAI in healthcare. Our findings show that XAI methods such as SHAP and LIME are used predominantly with structured data, such as electronic health records, while methods such as Grad-CAM and Layer-wise Relevance Propagation (LRP) are used mainly in medical imaging. Unlike prior descriptive reviews, this study specifically investigates the evaluation metrics used to operationalize explainability, including faithfulness, trustworthiness, and regulatory compliance. Our analysis shows that while XAI significantly enhances clinician trust, thorough validation of explanations remains heterogeneous and is largely confined to controlled settings and benchmark datasets. Critical barriers to clinical adoption include inconsistent interpretability across data modalities and the lack of standardized evaluation frameworks. Existing XAI techniques often do not align with strict regulatory requirements such as the EU AI Act, Food and Drug Administration (FDA) guidelines, and the Health Insurance Portability and Accountability Act (HIPAA). This review argues for the urgent standardization of XAI validation and for human-centered design, in order to move beyond algorithmic transparency toward reliable integration in real-world hospital settings.
Eje et al. studied this question.