What question did this study set out to answer?

The study investigates how large language models affect agent explainability and what risks their use involves.

April 30, 2026. Open Access.

The Illusion of Explainability with LLMs and LLM-Agents

Key Points

Investigated how large language models affect agent explainability and the risks this involves.
Conducted an architectural analysis of LLM-agents.
Provided an empirical illustration of how certain design choices may limit the explanatory queries that can be reliably answered.
Evaluated the effects of LLM integration on explainability.
Demonstrated that current LLM integration affects the reliability of explanations.
Highlighted that existing methods may undermine user trust.
Proposed a framework for human-oriented explainability in LLM-agents.

Abstract

The formal linguistic capabilities of Large Language Models (LLMs) are increasingly intersecting with the field of Explainable Agency (XAg). The growing adoption of LLM-agents has heightened the need to explain their behaviour to users. However, current methods consistently fail to meet desirable properties of human-centric explanations. Moreover, language models are being used to improve traditional agent explainability methods due to their conversational abilities and their resemblance to human responses, yet are often used without sufficient consideration of their limitations. In this paper, we show how the inclusion of LLM components in the agent architecture affects the reliability of the explanations produced. We argue that the widespread reliance on LLMs in XAg is polluting the definition of explanation and explainability. We question current methods of agentic explainability and argue that their risks may undermine trust. Through an architectural analysis and an empirical illustration, we highlight how certain design choices may limit the kinds of explanatory queries that can be reliably answered. Finally, we propose what human-oriented explainability should entail in LLM-agents, and we expose the limitations and opportunities of LLMs’ integration into agent explainability.

Authors

Sara Montese

Sergio Alvarez-Napagao

Victor Gimenez-Abalos

Institutions

Universitat Politècnica de Catalunya

Barcelona Supercomputing Center
