What question did this study set out to answer?

The central aim is to improve community energy management by addressing challenges posed by distributed energy resources.

April 3, 2026 · Open Access

A Spatio-Temporal Attention-Based Multi-Agent Deep Reinforcement Learning Approach for Collaborative Community Energy Trading

Key Points

Developed a Multi-Agent Transformer Proximal Policy Optimization (MATPPO) algorithm.
Utilized a hybrid LSTM–Transformer architecture for feature extraction.
Adopted a centralized training with decentralized execution framework.
Implemented a parameter-sharing mechanism for agents.
Conducted simulations to validate the approach.
MATPPO effectively mitigated power peaks in community energy usage.
Reduced pressure on transformer capacity at the main grid interface.
Lowered total community electricity costs significantly.
Maintained high computational efficiency in large-scale scenarios.
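
The Proximal Policy Optimization part of MATPPO presumably builds on the standard PPO clipped surrogate objective, which limits how far each update can move the policy. The sketch below illustrates that textbook objective only. This summary does not give MATPPO's actual loss, clip range, or advantage estimator, so the function name `clipped_surrogate` and the default `eps=0.2` are assumptions made for illustration.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. With parameter sharing, samples
    from all agents can be pooled into the same batch.
    eps: clip range (0.2 is a common default, assumed here).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)                # probability ratio
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))  # clip to [1-eps, 1+eps]
        total += min(ratio * adv, clipped * adv)         # pessimistic bound
    return total / len(advantages)


# Hypothetical batch: the policy has not changed, so every ratio equals 1
# and the objective reduces to the mean advantage.
objective = clipped_surrogate([0.0, 0.0], [0.0, 0.0], [1.0, 3.0])
```

Taking the minimum of the clipped and unclipped terms means an action with positive advantage gains nothing once its probability ratio exceeds 1 + eps, which is what keeps updates conservative.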

Abstract

The high penetration of distributed energy resources (DERs) poses numerous challenges to community energy management, including intense source-load stochasticity, synchronized load surges triggered by multi-agent gaming, and potential privacy breaches. To tackle these issues, this paper proposes a coordinated energy trading framework driven by an intermediate market-rate pricing mechanism. Within this framework, a novel Multi-Agent Transformer Proximal Policy Optimization (MATPPO) algorithm is developed, adopting an LSTM–Transformer hybrid architecture and the centralized training with decentralized execution (CTDE) paradigm. During centralized training, an LSTM network extracts temporal evolution features from source-load data to handle environmental uncertainty, while a Transformer-based self-attention mechanism reconstructs the dynamic agent topology to capture spatial correlations. In the decentralized execution phase, prosumers make independent decisions using only local observations. This eliminates the need to upload internal device states, which strengthens the privacy of sensitive local information during online execution. Additionally, a parameter-sharing mechanism lets agents share policy networks, which improves algorithmic scalability. Simulation results demonstrate that MATPPO effectively mitigates power peaks and reduces the transformer capacity pressure at the main grid interface. Furthermore, it significantly lowers total community electricity costs while maintaining high computational efficiency in large-scale scenarios.
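
To make the spatial part of the architecture concrete, the sketch below shows single-head scaled dot-product self-attention applied across agents, the general mechanism by which a Transformer can weight each prosumer's features against every other prosumer's. It is an illustration under stated assumptions, not the paper's network: the abstract does not give the number of heads, layers, or dimensions, and the input vectors here are hypothetical stand-ins for the per-agent temporal features an LSTM would produce. All function names and the identity projection matrices are made up for the example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def self_attention(embeddings, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over agent embeddings.

    embeddings: one feature vector per agent (assumed LSTM outputs).
    Returns (contexts, weights); weights[i][j] is how strongly agent i
    attends to agent j, and each row of weights sums to 1.
    """
    Q = [matvec(Wq, e) for e in embeddings]
    K = [matvec(Wk, e) for e in embeddings]
    V = [matvec(Wv, e) for e in embeddings]
    d_k = len(K[0])
    contexts, weights = [], []
    for q in Q:
        # Similarity of this agent's query to every agent's key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        weights.append(w)
        # Attention-weighted mix of all agents' value vectors.
        contexts.append(
            [sum(w[j] * V[j][d] for j in range(len(V))) for d in range(len(V[0]))]
        )
    return contexts, weights


# Hypothetical example: three prosumers with 2-dimensional features and
# identity projections (a trained model would learn Wq, Wk, Wv).
identity = [[1.0, 0.0], [0.0, 1.0]]
agent_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contexts, weights = self_attention(agent_features, identity, identity, identity)
```

Under the CTDE paradigm described above, a cross-agent computation of this kind belongs to centralized training, where all agents' features are available. At execution time each prosumer acts on its own local observation only.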
