Purpose This study addresses the challenge of optimizing joint pricing and inventory decisions in supply chains by proposing an Agentic Automation-driven Multi-Agent Reinforcement Learning (MARL) framework. By embedding autonomy, goal-directed behavior and self-improving capabilities into each decision-making entity, this research overcomes limitations of traditional optimization methods in handling demand heterogeneity, variable lead times, dynamic pricing and shared resources such as warehouse capacity.
Design/methodology/approach The methodology integrates principles of multi-agent automation within a Centralized-Training–Decentralized-Execution (CTDE) architecture, enabling agents to exhibit proactive, coordinated behavior. Agents are trained using shared global information (e.g. warehouse constraints and cross-product demand patterns) but execute specialized, independent policies for individual Stock Keeping Units (SKUs). The approach combines Deep Reinforcement Learning (RL) with inventory theory to jointly optimize pricing and replenishment decisions under stochastic demand, while enabling autonomous adaptation to changing market conditions.
Findings The framework is benchmarked against eight popular optimization and learning approaches: Bayesian Optimization, Genetic Algorithm (GA), Evolutionary Algorithm, Deep Q-Networks (DQN), Newsvendor Model, Economic Order Quantity (EOQ), Proximal Policy Optimization (PPO) and Soft Q-Learning (SQL). The results show that the agentic MARL system achieves strong, balanced performance (524k profit, 94.9% service level) with robust adaptability. The EOQ model yields higher profit (584k, 98.9% service level), but only in stable environments, because of its limited adaptiveness. Other RL methods (PPO and SQL) exhibit high variability, while traditional approaches (GA, rule-based, Bayesian) underperform, lacking the autonomy and learning capacity needed for dynamic business environments.
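Two of the classical baselines named in the Findings, EOQ and the newsvendor model, have well-known textbook closed forms. The sketch below shows those standard formulas only. It is not the study's benchmark code, and the function names, parameter names and the normal-demand assumption are illustrative choices, not taken from the paper.

```python
import math
from statistics import NormalDist


def eoq(demand_rate: float, order_cost: float, holding_cost: float) -> float:
    """Textbook EOQ: sqrt(2 * D * K / h).

    demand_rate  -- demand per period (D)
    order_cost   -- fixed cost per order (K)
    holding_cost -- holding cost per unit per period (h)
    """
    return math.sqrt(2.0 * demand_rate * order_cost / holding_cost)


def newsvendor_quantity(mean: float, std: float, unit_price: float,
                        unit_cost: float, salvage: float = 0.0) -> float:
    """Textbook newsvendor order quantity under an ASSUMED normal demand.

    The optimal quantity is the demand quantile at the critical ratio
    c_u / (c_u + c_o), where c_u is the underage cost (lost margin)
    and c_o is the overage cost (cost not recovered by salvage).
    """
    underage = unit_price - unit_cost
    overage = unit_cost - salvage
    critical_ratio = underage / (underage + overage)
    return NormalDist(mean, std).inv_cdf(critical_ratio)
```

Both formulas assume stationary demand and fixed prices, which is consistent with the abstract's remark that EOQ performs well only in stable environments.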
Originality/value This study provides: a scalable architecture enabling autonomous, goal-driven coordination for supply chain optimization; empirical evidence showing the advantages of agentic RL over traditional methods in complex, uncertain settings; and foundational insights for extending agentic AI to real-world applications such as promotion planning, supplier collaboration and end-to-end retail automation. Overall, this work bridges academic research and operational practice, providing a pathway toward intelligent, adaptive and agentic supply chain systems.
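The CTDE pattern described under Design/methodology/approach can be illustrated with a deliberately small toy. The sketch below is a tabular stand-in, not the deep RL method of the study: every constant, the demand model, the reward terms and all names (`SkuAgent`, `train`, `bucket`) are hypothetical. It shows only the structural idea: a central critic sees the joint state of all SKUs during training, while each agent acts from its own local observation.

```python
import random
from collections import defaultdict

# Hypothetical discrete action space: (order quantity, price) pairs.
ORDER_SIZES = [0, 10, 20]
PRICES = [8.0, 10.0]
ACTIONS = [(q, p) for q in ORDER_SIZES for p in PRICES]


def bucket(stock: int) -> int:
    """Coarse local observation: stock level in buckets of 10, capped at 3."""
    return min(stock, 30) // 10


class SkuAgent:
    """Decentralized actor: chooses an action from its own stock bucket only."""

    def __init__(self, epsilon: float = 0.1):
        self.q = defaultdict(float)  # (local_obs, action_index) -> value
        self.epsilon = epsilon

    def act(self, local_obs: int, rng: random.Random, explore: bool = True) -> int:
        if explore and rng.random() < self.epsilon:
            return rng.randrange(len(ACTIONS))
        return max(range(len(ACTIONS)), key=lambda a: self.q[(local_obs, a)])


def train(num_skus=2, capacity=40, episodes=200, horizon=20,
          alpha=0.1, gamma=0.9, seed=0):
    rng = random.Random(seed)
    agents = [SkuAgent() for _ in range(num_skus)]
    central_v = defaultdict(float)  # centralized critic over the JOINT state
    for _ in range(episodes):
        stock = [10] * num_skus
        for _ in range(horizon):
            obs = [bucket(s) for s in stock]
            acts = [ag.act(o, rng) for ag, o in zip(agents, obs)]
            profit = 0.0
            for i, a in enumerate(acts):
                order, price = ACTIONS[a]
                stock[i] += order
                demand = rng.randint(0, int(200 / price))  # toy price-sensitive demand
                sold = min(stock[i], demand)
                stock[i] -= sold
                profit += sold * price - 5.0 * order - 0.1 * stock[i]
            # Global information: shared warehouse capacity across all SKUs.
            overflow = max(0, sum(stock) - capacity)
            team_reward = profit - 2.0 * overflow
            g, g2 = tuple(obs), tuple(bucket(s) for s in stock)
            # Centralized training: the TD target bootstraps from the joint-state critic.
            target = team_reward + gamma * central_v[g2]
            central_v[g] += alpha * (target - central_v[g])
            for ag, o, a in zip(agents, obs, acts):
                ag.q[(o, a)] += alpha * (target - ag.q[(o, a)])
    return agents  # the critic is discarded; execution needs only the agents
```

After training, each agent can be queried with `agent.act(local_obs, rng, explore=False)` using nothing but its own stock bucket, which is the decentralized-execution half of the pattern.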
Sarit Maitra
Journal of Modelling in Management
Alliance University
Sarit Maitra studied this question.
www.synapsesocial.com/papers/69d5f14b74eaea4b11a7aef0 — DOI: https://doi.org/10.1108/jm2-07-2025-0347