ABSTRACT The rapid adoption of complex “black‐box” models in high‐stakes domains has made interpretability a functional prerequisite for artificial intelligence. This systematic literature review synthesizes the state of the art in Interpretable Machine Learning (IML) by analyzing 352 studies retrieved from multiple digital sources, including Wiley Online Library, IEEE Xplore, ACM Digital Library, and specialized AI repositories such as arXiv and PMLR. To provide a longitudinal perspective, the review covers a 75‐year window (1950–2025), tracing the field from foundational game‐theoretic axioms, specifically the Shapley value (1953), to contemporary deep‐learning‐driven methods. We propose a structured taxonomy that groups interpretability methods into three families: intrinsic models, post hoc model‐agnostic techniques (including RISE), and deep learning‐specific methods (including Grad‐CAM and its established variants). Our analysis reveals a shift from passive feature attribution toward “mechanistic interpretability” and actionable recourse. Furthermore, we critically assess emerging evaluation frameworks, highlighting the gap between algorithmic fidelity and human comprehension. This review provides an evidence base for researchers and practitioners seeking to develop AI systems that are transparent, robust, and aligned with human reasoning.
Shimon Fridkin
Michael Bendersky
Wiley Interdisciplinary Reviews Data Mining and Knowledge Discovery
Holon Institute of Technology
www.synapsesocial.com/papers/69be37dd6e48c4981c677e47 — DOI: https://doi.org/10.1002/widm.70075