As Reinforcement Learning (RL) is applied in safety-critical fields such as healthcare, autonomous driving, and finance, the lack of explainability limits its practical deployment and erodes public trust. This paper reviews progress in Explainable Reinforcement Learning (XRL) and constructs a framework covering core challenges, methods, and applications. The core challenges are black-box decision-making, reward bias, and complex multi-agent interaction. Among existing methods, intrinsically interpretable models are limited in their generalization ability, post-hoc explanation methods suffer from poor consistency between local and global explanations, and hybrid methods are difficult to adapt dynamically because of their high system complexity. Through causal reasoning and multimodal explanation, XRL shows potential in medical decision-making, autonomous driving safety, and financial risk control, but further optimization is needed. The paper emphasizes building a standardized evaluation system and explores frontier directions such as cross-domain transfer and multimodal human-computer collaborative explanation, with the aim of deepening XRL's theoretical framework and promoting its safe application in high-risk domains.
Shuqi Yang
ITM Web of Conferences
Shuqi Yang studied this question.
www.synapsesocial.com/papers/68c198cd9b7b07f3a061aadd — DOI: https://doi.org/10.1051/itmconf/20257801039