This paper presents a state-of-the-art survey of feature attribution techniques used in explainable AI. We organize the existing literature into a proposed taxonomy of model-agnostic and model-specific approaches. For each method, we analyze its formal definition, mathematical formulation, usage contexts, strengths, and limitations. A comparative analysis highlights key trade-offs in model agnosticism, explanation form, computational cost, and fidelity to the model. We find that model-agnostic techniques offer broad applicability by treating models as oracles, often at a higher computational cost, whereas model-specific methods leverage internal model architecture or gradients for potentially more efficient and faithful explanations, at the price of reduced generality.
Saidjon Kamolov studied this question.
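The contrast drawn in the abstract between oracle-style and gradient-based attribution can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy `LogisticModel`, the all-zeros baseline, and the choice of occlusion and gradient-times-input as representatives of the model-agnostic and model-specific families.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class LogisticModel:
    """Toy differentiable model (hypothetical, for illustration only)."""

    def __init__(self, w, b):
        self.w = np.asarray(w, dtype=float)
        self.b = float(b)

    def predict(self, x):
        # Scalar output in (0, 1) for a single input vector x.
        return sigmoid(np.dot(x, self.w) + self.b)

    def gradient(self, x):
        # Analytic gradient of the output with respect to the input.
        p = self.predict(x)
        return p * (1.0 - p) * self.w


def occlusion_attribution(predict, x, baseline):
    """Model-agnostic: queries predict only, d + 1 calls for d features."""
    x = np.asarray(x, dtype=float)
    full_output = predict(x)
    attributions = np.zeros_like(x)
    for i in range(x.size):
        x_occluded = x.copy()
        x_occluded[i] = baseline[i]  # replace one feature with its baseline
        attributions[i] = full_output - predict(x_occluded)
    return attributions


def gradient_times_input(model, x, baseline):
    """Model-specific: needs access to the model's gradient, one evaluation."""
    x = np.asarray(x, dtype=float)
    return model.gradient(x) * (x - baseline)


# Assumed toy setup: the third feature has zero weight.
model = LogisticModel(w=[2.0, -1.0, 0.0], b=0.1)
x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)

occ = occlusion_attribution(model.predict, x, baseline)
grad = gradient_times_input(model, x, baseline)
print("occlusion:       ", occ)
print("gradient x input:", grad)
```

In this toy setting both methods agree on the sign of each feature's contribution and assign zero to the feature with zero weight, but the occlusion variant needs one model query per feature while the gradient variant needs a single gradient evaluation and cannot be applied when gradients are unavailable.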