• DRL framework for joint retrieval and relocation scheduling in multi-deep warehouses.
• Heterogeneous graph representation captures diverse storage entity types.
• Instance-Invariant Baseline Regularization enables stable policy learning.
• A computationally efficient lower bound is derived as the instance-invariant baseline.
• Trained policy achieves lower makespan across various unseen warehouse configurations.

Multi-deep automated vehicle storage and retrieval systems (AVS/RS) offer high storage density, making them increasingly prevalent in modern logistics. However, their operational efficiency is often constrained by the need to relocate blocking items during retrieval. In this work, we consider a realistic scenario where only a subset of stored items is requested, and relocation naturally arises when target items are blocked by non-requested ones. We propose a deep reinforcement learning (DRL) framework for makespan minimization in multi-deep AVS/RS. The framework features a heterogeneous graph-based state representation that captures three distinct entity types (requested items, non-requested items, and empty locations) along with their structural relationships. The action space is designed to correspond to these node types, enabling the agent to handle both retrieval and relocation decisions within a unified framework. To address the high variance inherent in this problem, we propose the Instance-Invariant Baseline Regularization, which decouples the agent’s performance from the instance’s inherent complexity by deriving a computationally efficient lower bound for each state. Extensive experiments validate the effectiveness of the proposed approach. The agent trained with the proposed regularization demonstrates stable convergence and, more crucially, strong generalization across 64 unseen warehouse configurations of varying scale, consistently outperforming heuristic baselines.
These results highlight the potential of DRL for intelligent decision-making in complex warehouse management problems.
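To make the state representation and the idea of a cheap per-state bound more concrete, the following is a minimal Python sketch. It is not the authors' implementation: the lane encoding (`'R'`, `'N'`, `None`), the helper names `build_state` and `min_moves_lower_bound`, and the assumption that each lane is accessed from a single aisle-side end are all hypothetical choices made for illustration. The bound shown counts unavoidable moves only; the paper's actual makespan lower bound is not specified in the abstract and may differ.

```python
from collections import defaultdict

# Hypothetical node-type labels mirroring the three entity types in the abstract.
REQUESTED, NON_REQUESTED, EMPTY = "requested", "non_requested", "empty"
_KIND = {"R": REQUESTED, "N": NON_REQUESTED, None: EMPTY}


def build_state(lanes):
    """Build a toy heterogeneous graph from a list of lanes.

    Each lane is a list of slots; index 0 is the slot nearest the aisle
    (assumed single-ended access). A slot is 'R', 'N', or None.
    Returns (nodes_by_type, adjacency_edges).
    """
    nodes = defaultdict(list)  # node type -> list of (lane, depth)
    edges = []                 # (front_slot, slot_behind) pairs within a lane
    for lane_id, lane in enumerate(lanes):
        for depth, slot in enumerate(lane):
            nodes[_KIND[slot]].append((lane_id, depth))
            if depth > 0:
                edges.append(((lane_id, depth - 1), (lane_id, depth)))
    return dict(nodes), edges


def min_moves_lower_bound(lanes):
    """Count moves that no policy can avoid under the single-ended assumption.

    Every requested item must be retrieved, and every item stored in front of
    the deepest requested item in its lane must be moved at least once.
    """
    total = 0
    for lane in lanes:
        requested_depths = [d for d, s in enumerate(lane) if s == "R"]
        if not requested_depths:
            continue
        deepest = max(requested_depths)
        total += sum(1 for s in lane[: deepest + 1] if s is not None)
    return total


lanes = [
    ["N", "R", None],   # one blocker in front of a requested item
    ["R", "N", "N"],    # requested item directly accessible
    [None, None, "N"],  # nothing requested in this lane
]
nodes, edges = build_state(lanes)
bound = min_moves_lower_bound(lanes)
```

In this toy instance the bound is 3 moves: one relocation plus one retrieval in the first lane, and one retrieval in the second. A quantity of this kind depends only on the instance and not on the policy, which is the property an instance-invariant baseline relies on.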
Li et al. (Fri,) studied this question.