The high penetration of distributed energy resources (DERs) poses several challenges for community energy management, including strong source-load stochasticity, synchronized load surges triggered by multi-agent gaming, and potential privacy breaches. To address these issues, this paper proposes a coordinated energy trading framework driven by an intermediate market-rate pricing mechanism. Within this framework, a novel Multi-Agent Transformer Proximal Policy Optimization (MATPPO) algorithm is developed, which combines an LSTM–Transformer hybrid architecture with the centralized training with decentralized execution (CTDE) paradigm. During centralized training, an LSTM network extracts temporal features from source-load data to handle environmental uncertainty, while a Transformer-based self-attention mechanism reconstructs the dynamic agent topology to capture spatial correlations among agents. During decentralized execution, each prosumer makes decisions independently using only its local observations. Because internal device states do not need to be uploaded, the privacy of sensitive local information is better protected during online execution. In addition, a parameter-sharing mechanism lets agents share a policy network, which improves the scalability of the algorithm. Simulation results show that MATPPO mitigates power peaks and reduces the capacity pressure on the transformer at the main grid interface. It also lowers total community electricity costs while maintaining high computational efficiency in large-scale scenarios.
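To make two of the ingredients named above more concrete, the following is a minimal sketch, not the paper's implementation. It shows single-head scaled dot-product self-attention computed across agents (the kind of operation a Transformer layer would apply during centralized training) and a single set of policy weights reused by every agent on its own local observation (parameter sharing under decentralized execution). All names (`self_attention`, `shared_policy`, `W_pi`), the dimensions, and the random weights are hypothetical; the random feature vectors merely stand in for LSTM-encoded source-load features, and no training loop or PPO update is shown.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def self_attention(features, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over agents.

    features: one feature vector per agent (assumed here to be a
    stand-in for LSTM-encoded temporal features).
    Returns the attended vectors and the agent-to-agent weights.
    """
    Q = [matvec(Wq, f) for f in features]
    K = [matvec(Wk, f) for f in features]
    V = [matvec(Wv, f) for f in features]
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)  # how strongly this agent attends to each agent
        weights.append(w)
        outputs.append(
            [sum(w_j * v[c] for w_j, v in zip(w, V)) for c in range(len(V[0]))]
        )
    return outputs, weights


def shared_policy(local_obs, W_pi):
    """Map one agent's local observation to action probabilities.

    The same W_pi is used for every agent (parameter sharing), and only
    the agent's own observation is needed (decentralized execution).
    """
    return softmax(matvec(W_pi, local_obs))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes chosen only for illustration.
rng = random.Random(0)
n_agents, d_feat, d_att, n_actions = 4, 3, 2, 3
features = [[rng.uniform(-1, 1) for _ in range(d_feat)] for _ in range(n_agents)]
Wq, Wk, Wv = (random_matrix(d_att, d_feat, rng) for _ in range(3))
W_pi = random_matrix(n_actions, d_feat, rng)

# Training-time view: attention mixes information across all agents.
attended, attn_weights = self_attention(features, Wq, Wk, Wv)

# Execution-time view: each agent acts on its own observation only.
action_probs = [shared_policy(obs, W_pi) for obs in features]
```

In this sketch each row of `attn_weights` can be read as a soft, learned adjacency over agents, which is one way to interpret "reconstructing the dynamic agent topology". The execution path never touches other agents' features, which is what keeps internal device states local.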
Chen et al. (Wed,) studied this question.