What question did this study set out to answer?

This research aims to optimize personalized professional development paths for teachers in cybersecurity education using advanced machine learning techniques.

February 14, 2026 | Open Access

Adaptive Deep Reinforcement Learning for Optimizing Teacher Professional Development Path

Key Points

This research aims to optimize personalized professional development paths for teachers in cybersecurity education using advanced machine learning techniques.
Developed an adaptive deep reinforcement learning framework for teacher development path planning.
Introduced improved models: Deep Q-Network (DQN) with a multi-layer perceptron and a cubic dynamic reward function.
Implemented PER-D3QN model using prioritized experience replay and dueling double deep Q-learning.
Improved DQN achieved average performance scores of up to 0.36, compared with 0.054–0.068 for the traditional DQN.
PER-D3QN model outperformed the ERDQN baseline with an average reward of 4.738 compared to 2.021.
Average score for PER-D3QN was 2.799 after 6000 training rounds, surpassing 1.946 for ERDQN.

Abstract

Ensuring equitable access to cybersecurity expertise has become increasingly critical given the growing complexity of digital threats. As educators are tasked with delivering instruction in areas such as computer fraud prevention and network security, there is a pressing need for adaptive, data‐informed professional development systems that can support individualized learning paths. To address this challenge, this study proposes an adaptive deep reinforcement learning framework for optimizing personalized teacher development path planning in cybersecurity education. Two enhanced models are introduced: an improved Deep Q‐Network (DQN) that integrates a multi‐layer perceptron, a cubic dynamic reward function, and an adaptive exploration strategy; and a PER‐D3QN model that combines dueling double deep Q‐learning (D3QN) and prioritized experience replay (PER) to mitigate Q‐value overestimation and accelerate convergence. Experimental evaluation using real‐world teacher data demonstrates that the improved DQN achieved average performance scores up to 0.36, compared to 0.054–0.068 for the traditional DQN. Moreover, the PER‐D3QN model outperformed the ERDQN baseline, attaining an average reward of 4.738 versus 2.021 and an average score of 2.799 after 6000 training rounds versus 1.946 for ERDQN, which indicates that network update speed also improved significantly. This research not only helps to enhance teachers' professional knowledge and technical skills in network security, but also gives educational institutions methodological support for keeping their training aligned with changing security threats. Furthermore, this study emphasizes the importance of interdisciplinary cooperation and encourages experts from computer science, education, and psychology to work together.
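
The abstract names three standard ingredients of PER‐D3QN: a dueling value/advantage split, a double Q‐learning target, and prioritized experience replay. The sketch below illustrates the textbook form of each in plain Python. It is not the study's implementation: the function names, the `alpha` and `eps` values, and the toy numbers are assumptions made for illustration, and the study's cubic dynamic reward function and adaptive exploration strategy are not shown because the abstract does not define them.

```python
import random


def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it. Decoupling selection from evaluation
    is what reduces Q-value overestimation."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritized replay: P(i) = p_i**alpha / sum_k p_k**alpha,
    with p_i = |td_error_i| + eps. alpha and eps are illustrative values."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_indices(probs, batch_size, rng=random):
    """Draw replay-buffer indices (with replacement) according to priority."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)


# Toy usage with made-up numbers (not data from the study).
q_values = dueling_q(1.0, [0.5, -0.5, 0.0])
target = double_dqn_target(1.0, 0.9, [0.2, 0.8], [5.0, 1.0], done=False)
probs = per_probabilities([0.0, 1.0, 2.0])
batch = sample_indices(probs, batch_size=4, rng=random.Random(0))
```

In the toy call, the online network prefers action 1, so the target uses the target network's estimate for that action rather than the target network's own maximum of 5.0. A plain DQN target would take that maximum, which is the overestimation that double Q‐learning is meant to curb.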
