Abstract. Existing deep reinforcement learning (DRL)-based Traffic Engineering (TE) approaches in Software-Defined Networking (SDN) often suffer from three practical issues: an excessively high-dimensional link-weight action space, routing instability caused by global weight perturbations, and noisy exploratory experience that hinders stable multi-objective learning. These limitations slow convergence and make decisions less reliable under dynamic traffic demands. To address them, we propose CLPE-TE, a TE framework built on Deep Deterministic Policy Gradient (DDPG) that combines two components. The first is a structural- and stability-aware critical-link selection strategy that restricts optimization to a compact, high-impact action subset. The second is a performance-driven multi-sample refinement mechanism that generates improved candidate actions around both the actor output and a stable baseline; these candidates are selectively injected into training through a dual-buffer replay scheme. The resulting policy achieves better trade-offs among delay, load balancing, and rerouting stability. Experiments on the Abilene, CERNET, and GÉANT backbone networks show that, compared with representative baselines, CLPE-TE reduces maximum link utilization by up to 25%, lowers average end-to-end delay by 66%, and consistently incurs lower rerouting overhead. The framework also remains robust under bursty traffic, offering a reliable and practical solution for dynamic TE in SDN.
Qi et al. (Mon,) studied this question.
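To make the multi-sample refinement and dual-buffer replay ideas from the abstract more concrete, the following is a minimal Python sketch under stated assumptions, not the authors' implementation. The abstract does not specify the sampling distribution, the selection rule, or the buffer mixing ratio, so Gaussian perturbations, best-score selection, and a fixed `refined_fraction` are all assumptions. The names `refine_action`, `DualBufferReplay`, `evaluate`, `num_samples`, `sigma`, and `refined_fraction` are hypothetical.

```python
import random
from collections import deque


def refine_action(actor_action, baseline_action, evaluate,
                  num_samples=8, sigma=0.1, rng=None):
    """Return the best-scoring candidate sampled around two anchor actions.

    `evaluate` is a hypothetical callable that maps an action (a list of
    link weights) to a scalar score, where higher is better. Gaussian
    perturbation is an assumption; the abstract does not name a distribution.
    """
    rng = rng or random.Random()
    # Keep both anchors so the result is never worse than either of them.
    candidates = [list(actor_action), list(baseline_action)]
    for anchor in (actor_action, baseline_action):
        for _ in range(num_samples):
            candidates.append([w + rng.gauss(0.0, sigma) for w in anchor])
    return max(candidates, key=evaluate)


class DualBufferReplay:
    """Two replay buffers: ordinary exploratory and refined transitions."""

    def __init__(self, capacity=10000, refined_fraction=0.25):
        self.explore = deque(maxlen=capacity)
        self.refined = deque(maxlen=capacity)
        # Assumed fixed share of each batch drawn from the refined buffer.
        self.refined_fraction = refined_fraction

    def add(self, transition, refined=False):
        (self.refined if refined else self.explore).append(transition)

    def sample(self, batch_size, rng=None):
        rng = rng or random.Random()
        n_refined = min(len(self.refined),
                        int(batch_size * self.refined_fraction))
        n_explore = min(len(self.explore), batch_size - n_refined)
        return (rng.sample(list(self.refined), n_refined)
                + rng.sample(list(self.explore), n_explore))
```

In a training loop of this shape, a refined action would be stored with `refined=True` only when its score exceeds that of the raw actor output, which is one plausible reading of "selectively injected"; the paper may use a different criterion.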