Neurosymbolic AI (NeSy) offers a promising approach to addressing interpretability challenges in artificial intelligence by bridging neural networks and symbolic reasoning. This paper explores how NeSy can enhance Mechanistic Interpretability (MI) and Explainable AI (XAI) by providing transparent insights into AI decision-making processes. By integrating symbolic representations with neural networks, NeSy mitigates the black-box nature of traditional AI systems, enabling a deeper understanding of their internal mechanisms. We propose a structured five-phase methodology that integrates NeSy with MI, including component identification, symbolic alignment, causal verification, evaluation metrics, and iterative refinement. This framework provides a novel, theoretically grounded approach for embedding structured knowledge and logical reasoning into AI models while maintaining performance. To ground the framework in practice, a hypothetical illustrative scenario in medical imaging operationalizes all five phases, clarifying how each metric could be applied, what outcomes would be expected in principle, and what limitations practitioners should anticipate. No empirical experiments are conducted; the scenario is intended solely to aid conceptual understanding of the framework. The study concludes with a forward-looking analysis of NeSy’s potential role in advancing Artificial General Intelligence (AGI) through greater transparency, accountability, and alignment with human values.
Abraham Itzhak Weinberg studied this question.
synapsesocial.com/papers/6a2117dfd499ed480b170c34 — DOI: https://doi.org/10.1007/s44163-026-01424-8
Abraham Itzhak Weinberg
Tel Aviv University
Discover Artificial Intelligence
Tinbergen Institute