What question did this study set out to answer?

To develop a transparent and effective deep reinforcement learning framework for adaptive patrol of robots in complex environments.

April 16, 2026 | Open Access

Explainable and transferable deep reinforcement learning for adaptive patrol of rail-guided robot system

Key Points

Utilized Deep Deterministic Policy Gradient (DDPG) for continuous speed-control policy learning from images.
Integrated Grad-CAM to provide visual explanations of speed decisions.
Employed CycleGAN for domain adaptation to ensure real-world applicability without retraining.
Conducted experiments in simulation and physical environments to evaluate performance.
The robot successfully adapts its patrol speed in response to environmental complexity.
The learned policy offers clear visual explanations for its speed decisions.
Grad-CAM is used to check that domain-adapted images stay semantically consistent and preserve task-relevant cues.
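The DDPG component named in the key points can be illustrated with a minimal Python sketch of three standard pieces of the algorithm: squashing the actor output into a bounded speed, computing the critic's bootstrapped target, and soft-updating the target networks. The speed bounds `v_min` and `v_max`, the discount `gamma`, the rate `tau`, and all function names are assumptions chosen for illustration; the source does not report them.

```python
import math


def scale_action(raw, v_min, v_max):
    """Map an unbounded actor output to a speed in [v_min, v_max] via tanh.

    v_min and v_max are hypothetical bounds, not values from the paper.
    """
    return v_min + 0.5 * (math.tanh(raw) + 1.0) * (v_max - v_min)


def td_target(reward, next_q, gamma=0.99, done=False):
    """Critic regression target: r + gamma * Q'(s', mu'(s')).

    next_q stands for the target critic evaluated at the target actor's
    action in the next state; the bootstrap term is dropped at episode end.
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


if __name__ == "__main__":
    # A raw output of 0 lands at the midpoint of the assumed speed range.
    print(scale_action(0.0, v_min=0.1, v_max=1.0))
    print(td_target(reward=1.0, next_q=2.0, gamma=0.5))
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1))
```

In a full implementation these scalars and lists would be network outputs and parameter tensors, with the actor and critic taking image observations as input. The sketch only shows the update arithmetic.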

Abstract

Intelligent facility management systems can reduce the workload of human operators by enabling autonomous operation. However, the lack of transparency in existing machine learning-based systems often hinders user trust, especially in safety-critical environments such as industrial and public facilities. To ensure reliability and accountability, autonomous systems must not only perform effectively but also provide human-understandable explanations for their actions. This article presents an explainable deep reinforcement learning framework for a rail-guided patrol robot that adaptively controls its speed based on the visual complexity of its surroundings. The proposed system employs the Deep Deterministic Policy Gradient (DDPG) algorithm to learn a continuous speed-control policy directly from image-based observations. To enhance transparency, Gradient-weighted Class Activation Mapping (Grad-CAM) is integrated into the actor network to visualize which spatial regions of the input most strongly influence speed decisions, providing post hoc explanations of the model’s decisions. To support real-world deployment, we incorporate a Cycle-Consistent Generative Adversarial Network (CycleGAN)-based domain adaptation module that transforms real camera images into a simulation-compatible visual style, enabling the trained policy to operate without additional retraining. Grad-CAM is also used to assess the semantic consistency of translated images and verify that domain adaptation preserves task-relevant visual cues. Because the proposed framework is designed around lightweight visual inputs and compact neural networks, its computational demand remains modest and suitable for embedded execution. Grad-CAM analysis is used for explainability rather than for action generation, and its computation does not affect the timing of the control loop. The framework is evaluated through extensive experiments in both simulation and a physical testbed environment. 
Results demonstrate that the robot successfully adjusts its patrol speed in response to scene complexity and that the learned policy provides coherent and meaningful visual explanations. These findings highlight the potential of combining deep reinforcement learning, visual domain adaptation, and explainable AI to realize trustworthy and adaptable autonomous patrol systems.
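The Grad-CAM step described in the abstract can be sketched in plain Python. The sketch assumes the feature maps of one convolutional layer of the actor and the gradients of the scalar speed output with respect to those maps are already available (in practice an autograd framework would supply them). It then applies the standard Grad-CAM combination: average each channel's gradients to get a weight, take the weighted sum of the feature maps, and apply ReLU. The function name, the nested-list layout, and the final normalization are illustrative assumptions, not details from the paper.

```python
def grad_cam(activations, gradients):
    """Combine feature maps and gradients into a Grad-CAM heatmap.

    activations, gradients: lists of K channels, each an H x W nested list.
    gradients[c][i][j] is d(speed)/d(activations[c][i][j]).
    Returns an H x W heatmap scaled to [0, 1] (all zeros if nothing is positive).
    """
    k = len(activations)
    h = len(activations[0])
    w = len(activations[0][0])

    # Channel weights: global average pooling of the gradients.
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]

    # Weighted sum over channels followed by ReLU.
    cam = [
        [max(0.0, sum(weights[c] * activations[c][i][j] for c in range(k)))
         for j in range(w)]
        for i in range(h)
    ]

    # Optional normalization for display.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


if __name__ == "__main__":
    acts = [[[1.0, 0.0], [0.0, 0.0]],   # channel 0 fires top-left
            [[0.0, 0.0], [0.0, 2.0]]]   # channel 1 fires bottom-right
    grads = [[[1.0, 1.0], [1.0, 1.0]],      # channel 0 raises the speed
             [[-1.0, -1.0], [-1.0, -1.0]]]  # channel 1 lowers it
    print(grad_cam(acts, grads))
```

Because the heatmap is computed from stored activations and gradients, it can run outside the control loop, which matches the abstract's statement that explanation does not affect control timing. Applying the same function to a real image and to its translated counterpart would give one possible way to compare the highlighted regions.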


Lee et al. studied this question.

synapsesocial.com/papers/69e07de52f7e8953b7cbeda4 https://doi.org/10.7717/peerj-cs.3722
