Vision-based decision-making is relevant to many domains, including safety-critical ones where transparency matters as much as performance. Automating sequential decision-making in such settings therefore requires approaches that balance effectiveness with interpretability. While deep reinforcement learning techniques based on artificial neural networks have achieved strong performance, their black-box nature typically necessitates post hoc explainability analyses. To address this limitation, we propose an approach based on Graph-based Genetic Programming (GGP) that generates policies in the form of computer code, which is fully observable and inherently interpretable. To improve both performance and robustness, we expose GGP-based visual control policies to multiple representative conditions during optimization, mitigating convergence to strong local optima and fragility under changing conditions. Finally, to gain insights into the evolutionary search dynamics, we employ Search Trajectory Networks, an analytical and visualization tool for studying optimization behavior. Our results demonstrate that the resulting policies approach human-level performance and empirically confirm the presence of strong local optima acting as attractors during evolution, providing new insights into the behavior and potential limitations of interpretable evolutionary policy search approaches.
Berghegger et al. studied this question.