Abstract

Conventional toolpath strategies in additive manufacturing often ignore the underlying physics, leading to defects and inefficiency. In this paper, we present a reinforcement learning framework that couples policy learning directly with finite-element thermal simulations for powder bed fusion, enabling agents to learn adaptive toolpaths from spatially resolved temperature feedback. Using procedurally generated geometries to ensure robust generalization, we train Deep Q-Network and Proximal Policy Optimization agents in both simplified and thermally coupled environments. Our results demonstrate that learned policies outperform conventional zigzag baselines in terms of speed and generalize effectively to irregular geometries, discovering emergent thermal strategies such as outside-to-inside scanning patterns.
Schmeitz et al. studied this question.