March 1, 2026 | Open Access

Efficient Training in Multi-agent Reinforcement Learning: A Communication-free Framework for the Box-pushing Problem

Key Points

The aim is to enhance coordination in multi-agent systems without incurring high communication costs during training.
Developed the Shared Pool of Information (SPI) framework to provide structured information at initialization
Assessed SPI's efficiency in the box-pushing problem
Focused on improving exploration and decision-making without direct communication
SPI accelerated learning and enhanced coordination among agents
Increased success rates in completing the box-pushing task
Optimized trajectory choices made by agents

Abstract

Effective coordination is a critical challenge in the design of self-organizing systems (SOSs), particularly in decentralized training, where explicit communication incurs high costs. Traditional approaches often rely on inter-agent communication, which can introduce substantial system overhead. The challenge is even more pronounced in multi-agent reinforcement learning (MARL)-based systems, where training happens in an end-to-end, black-box manner. To address this issue, we explore ways to enhance the exploration phase without relying on direct communication, thereby improving search efficiency. We propose the Shared Pool of Information (SPI), a communication-free framework designed to provide agents with structured shared information at initialization. By offering a common foundation for exploration, SPI helps guide group action choices and facilitates more effective decision-making. This approach enables agents to learn coordinated behaviors without incurring the high costs associated with continuous communication. The efficiency of SPI is assessed and validated on the box-pushing problem, a task that requires agents to collaboratively maneuver a box toward a goal while avoiding obstacles. Our findings indicate that SPI accelerates learning, enhances coordination, increases success rates, and improves the trajectories agents choose. These results highlight SPI as a promising and scalable approach for scenarios where communication is costly or infeasible.
