What question did this study set out to answer?

The aim is to develop a reinforcement learning controller to optimize power flow management in grid-connected electric vehicles.

May 9, 2026 | Open Access

Deep reinforcement learning based dynamic power flow controller for grid-connected EV

Key points

Developed a Twin Delayed Deep Deterministic Policy Gradient (TD3) controller for EV G2V/V2G power flow management.
Modeled the control problem as a Markov Decision Process (MDP) with an extensive simulation framework.
Conducted a comparative analysis against a conventional Proportional–Integral (PI) controller.
TD3 controller achieves active and reactive power tracking errors of 1% and 1.02%, improving tracking accuracy by 96.4% (G2V) and 95.1% (V2G).
Maintains grid-current Total Harmonic Distortion (THD) at 4.63% in G2V and below 5% in V2G mode.
Demonstrates enhanced stability under grid disturbances compared to classical control methods.
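
The THD figures quoted above express harmonic content as a fraction of the fundamental component of the grid current. The following is a minimal sketch of how such a figure can be computed from sampled data, assuming the samples span exactly one fundamental period. The function names, the synthetic waveform, and the cutoff at the 40th harmonic are illustrative assumptions and are not taken from the study.

```python
import math


def harmonic_amplitude(samples, k):
    """Peak amplitude of the k-th harmonic.

    Assumes `samples` covers exactly one fundamental period,
    so harmonic k falls exactly on DFT bin k.
    """
    n = len(samples)
    re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(samples))
    im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


def thd_percent(samples, max_harmonic=40):
    """THD in percent: RMS sum of harmonics 2..max_harmonic over the fundamental."""
    fundamental = harmonic_amplitude(samples, 1)
    distortion = math.sqrt(
        sum(harmonic_amplitude(samples, k) ** 2 for k in range(2, max_harmonic + 1))
    )
    return 100.0 * distortion / fundamental


# Synthetic one-cycle current (hypothetical data): unit fundamental
# plus 4% fifth harmonic and 2% seventh harmonic.
N = 256
current = [
    math.sin(2 * math.pi * i / N)
    + 0.04 * math.sin(2 * math.pi * 5 * i / N)
    + 0.02 * math.sin(2 * math.pi * 7 * i / N)
    for i in range(N)
]

thd = thd_percent(current)
print(f"THD = {thd:.2f}%  (within 5% limit: {thd < 5.0})")
```

For this synthetic waveform the result is sqrt(0.04^2 + 0.02^2), about 4.47%, which sits under the 5% threshold mentioned in the key points. Real measurements would need windowing or period synchronisation, which this sketch omits.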

Abstract

Electric vehicles (EVs) represent a significant shift in transportation and energy consumption, necessitating efficient power management and grid stabilization. Voltage Source Inverters (VSIs) play a crucial role in three-phase charging infrastructure by facilitating bidirectional power exchange for Grid-to-Vehicle (G2V) and Vehicle-to-Grid (V2G) applications. This paper presents a Twin Delayed Deep Deterministic Policy Gradient (TD3) reinforcement learning controller designed to regulate real and reactive power flow in a grid-connected EV system. By modelling the control problem as a Markov Decision Process (MDP), the proposed dynamic controller learns optimal switching actions to manage voltage and current, enabling effective four-quadrant operation. Experimental results demonstrate that the proposed approach significantly outperforms the conventional Proportional–Integral (PI) controller. While the PI controller exhibits approximately 4% error in power tracking, the TD3 agent reduces active and reactive power tracking errors to 1% and 1.02% respectively, representing an improvement of 96.4% during G2V operation and 95.1% during V2G operation. Furthermore, the system maintains power quality within acceptable limits, achieving a grid-current Total Harmonic Distortion (THD) of 4.63% in G2V mode and below 5% in V2G mode. These findings confirm that the TD3-based VSI controller enables stable, accurate, and intelligent bidirectional power control, maximising the active participation of EVs in future smart grid management.

• A comprehensive TD3-based controller is developed for three-phase EV G2V/V2G power flow management.
• The proposed model demonstrates full four-quadrant operation (P–Q plane) with enhanced tracking accuracy and robustness.
• Unlike classical PI and Model-Free Predictive Controller (MF-PC) approaches, the controller maintains stability under grid disturbances such as voltage sag, swell, and harmonic injection.
• A complete simulation framework is developed, including converter modelling, reward shaping, RL training environment, and deployment pipeline.
• The study incorporates C-rate–dependent EV battery stress analysis, which is rarely considered in EV-grid interaction research.
• Extensive comparative analysis shows that the proposed controller reduces steady-state error to below 1% and maintains THD within IEEE-519 standards.
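
For readers unfamiliar with TD3, its core update combines two ideas: target policy smoothing (clipped noise added to the target action) and clipped double-Q learning (taking the smaller of two critic estimates). The sketch below shows that target computation in plain Python. It is a generic illustration of the TD3 algorithm, not the authors' implementation: the stub actor and critics, the two-element action, the reward value, and all hyperparameter values are hypothetical placeholders.

```python
import random


def clip(value, low, high):
    """Limit `value` to the closed interval [low, high]."""
    return max(low, min(high, value))


def td3_target(reward, next_state, done,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Bootstrapped TD3 target for one transition.

    The hyperparameter defaults are illustrative assumptions.
    """
    # Target policy smoothing: perturb each target-action component
    # with clipped Gaussian noise, then clip to the action bounds.
    smoothed = []
    for a in actor_target(next_state):
        eps = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
        smoothed.append(clip(a + eps, action_low, action_high))

    # Clipped double-Q: use the more pessimistic of the two target critics.
    q_min = min(critic1_target(next_state, smoothed),
                critic2_target(next_state, smoothed))

    # No bootstrapping past a terminal state.
    return reward + gamma * (0.0 if done else 1.0) * q_min


# Hypothetical stand-ins for trained target networks.
actor = lambda state: [0.5, -0.5]          # e.g. two continuous control signals
q1 = lambda state, action: 1.0 + sum(action)
q2 = lambda state, action: 2.0 + sum(action)

# With noise disabled the result is deterministic.
y = td3_target(reward=-0.1, next_state=(0.0, 0.0), done=False,
               actor_target=actor, critic1_target=q1, critic2_target=q2,
               gamma=0.9, noise_std=0.0)
print(f"TD3 target: {y:.3f}")
```

In a power-tracking setting the reward would typically penalise the gap between reference and measured active and reactive power. The abstract mentions reward shaping but does not give the exact form, so the reward is left as a plain number here.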
