What question did this study set out to answer?

This research aims to improve the performance of particle swarm optimization by integrating reinforcement learning techniques.

June 18, 2026 | Open Access

Reinforcement learning-based adaptive particle swarm optimization

Key Points

Proposed RLAPSO, a reinforcement learning-based adaptive variant of particle swarm optimization (PSO).
Introduced a multi-information source-driven strategy (MISDS) to enhance population diversity.
Used proximal policy optimization (PPO) to adjust the PSO parameters adaptively.
Developed a dynamic reward function that adjusts feedback intensity in real time according to search progress.
RLAPSO considerably improved convergence speed and accuracy on the CEC 2013 and CEC 2022 test sets.
Compared with classic PSO variants and other swarm intelligence algorithms, RLAPSO showed superior overall performance on complex optimization problems.
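The dynamic reward idea in the key points can be illustrated with a small sketch. This summary does not give the paper's actual reward formula, so everything below is an assumption made for illustration: the function name dynamic_reward, the linear scaling with progress, the base and gain parameters, and the stagnation penalty are all hypothetical.

```python
def dynamic_reward(prev_best, curr_best, step, max_steps, base=1.0, gain=4.0):
    """Hypothetical progress-scaled reward for a minimization problem.

    Assumption: feedback intensity grows linearly with search progress,
    so a late improvement is rewarded more than an equal early one.
    This is NOT the formula from the paper, only an illustration.
    """
    progress = step / max_steps          # 0.0 at the start, 1.0 at the end
    scale = base + gain * progress       # feedback intensity at this stage
    improvement = prev_best - curr_best  # positive when the best fitness drops
    if improvement > 0:
        # Relative improvement keeps the reward independent of fitness units.
        return scale * improvement / (abs(prev_best) + 1e-12)
    # Assumed small penalty when the search stagnates or regresses.
    return -0.1 * scale


early = dynamic_reward(10.0, 5.0, step=0, max_steps=100)
late = dynamic_reward(10.0, 5.0, step=100, max_steps=100)
stalled = dynamic_reward(10.0, 10.0, step=50, max_steps=100)
```

Under these assumptions, the same improvement earns a larger reward late in the run than early, and a step with no improvement earns a negative reward.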

Abstract

The particle swarm optimization (PSO) algorithm is a widely used optimization method that has demonstrated excellent performance in several fields. However, PSO frequently suffers from slow convergence and inadequate convergence accuracy when handling complex optimization problems. To solve these problems, this paper proposes a reinforcement learning-based adaptive particle swarm optimization (RLAPSO) algorithm. First, a multi-information source-driven strategy (MISDS) is introduced to enhance population diversity. By integrating comprehensive and mainstream learning concepts, the proposed MISDS optimizes the update mechanism for particles' positions and velocities. This enables particles to learn not only from the global optimal solution and their own individual optimal solutions but also from the individual optimal solutions of other particles, as well as from the average of these solutions. Second, to improve both the convergence speed and accuracy of PSO, the proximal policy optimization (PPO) method is used to adjust the PSO parameters adaptively. Finally, to further speed up convergence and improve solution accuracy, a dynamic reward function is proposed within the PPO framework, which adjusts the intensity of the feedback in real time according to the progress of the search. The experimental results show that RLAPSO considerably increases convergence speed and convergence accuracy on both the CEC 2013 and CEC 2022 test sets. Compared with classic PSO variants and other swarm intelligence (SI) algorithms, RLAPSO exhibits superior overall performance when addressing complex optimization problems.
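To make the four information sources named in the abstract concrete, here is a minimal sketch of a velocity update that draws on all of them. The abstract does not state the exact MISDS update rule, so the additive form, the function name misds_velocity, the inertia weight w, the coefficients c, and the per-dimension random choice of another particle's personal best are all assumptions for illustration. In RLAPSO, parameters such as w and c would be the kind of values a PPO agent adjusts; here they are fixed defaults.

```python
import random


def misds_velocity(v, x, pbest, gbest, all_pbests,
                   w=0.7, c=(1.5, 1.5, 1.0, 1.0), rng=random):
    """Hypothetical multi-source velocity update for one particle.

    Sources: own personal best, global best, another particle's personal
    best (chosen at random per dimension), and the mean of all personal
    bests. The additive form and coefficients are assumptions, not the
    paper's equation.
    """
    dim = len(x)
    n = len(all_pbests)
    # Per-dimension mean of every particle's personal best.
    mean_pbest = [sum(p[d] for p in all_pbests) / n for d in range(dim)]
    new_v = []
    for d in range(dim):
        exemplar = rng.choice(all_pbests)      # some particle's personal best
        r = [rng.random() for _ in range(4)]   # fresh random factors per term
        new_v.append(
            w * v[d]
            + c[0] * r[0] * (pbest[d] - x[d])
            + c[1] * r[1] * (gbest[d] - x[d])
            + c[2] * r[2] * (exemplar[d] - x[d])
            + c[3] * r[3] * (mean_pbest[d] - x[d])
        )
    return new_v


# Usage: a three-particle swarm in two dimensions.
pbests = [[1.0, 2.0], [0.5, 1.5], [2.0, 0.0]]
velocity = misds_velocity(v=[0.1, -0.2], x=[1.2, 1.8],
                          pbest=pbests[0], gbest=pbests[1], all_pbests=pbests)
new_position = [xi + vi for xi, vi in zip([1.2, 1.8], velocity)]
```

Drawing on several exemplars rather than the global best alone is one common way to slow the collapse of the swarm onto a single point, which is the diversity effect the abstract attributes to MISDS.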
