What question did this study set out to answer?

This research aims to enhance the adaptability of slicing policies in mobile networks using a Continual Mixture of Experts framework.

February 14, 2026 · Open Access

CoMEx: Continual Mixture of Experts for Fast Policy Adaptation in RAN Slicing

Key Points

Proposed the CoMEx framework for fast policy adaptation in network slicing.
Pre-trained and froze multiple expert policies based on diverse SLA preferences.
Introduced a DRL-based gating network that combines expert actions for real-time adjustments.
Implemented a masked expert expansion mechanism for incorporating new experts.
Achieved a mean score of 78.95 under unseen SLA weights with step-level DRL gating, outperforming episode-level and supervised gating by 2.40% and 27.67%, respectively.
Showed a 7.08% performance boost upon adding a fourth expert, reaching a score of 84.54.

Abstract

Network slicing is a cornerstone of 5G/6G vertical services, yet practical deployments require mobile network operators (MNOs) to adjust slice service level agreement (SLA) weights based on quality of experience (QoE), causing rapid non-stationary objective changes that can destabilize deep reinforcement learning (DRL) slicing policies and necessitate retraining. This paper proposes Continual Mixture of Experts (CoMEx) for fast policy adaptation. CoMEx pre-trains and freezes multiple expert policies under diverse SLA preferences, explicitly appends the SLA weight vector to observations, and trains a DRL-based gating network to fuse expert actions at the step level for fast adaptation to unseen SLA configurations. To broaden coverage without degrading existing experts, CoMEx further incorporates a masked expert expansion mechanism that incrementally adds new experts and fine-tunes the gate. Step-level DRL gating demonstrates superior generalization in RAN slicing, attaining a mean score of 78.95 under unseen SLA weights—surpassing episode-level and supervised gating by 2.40% and 27.67%, respectively. Moreover, CoMEx’s extensibility is highlighted by a 7.08% performance boost (reaching 84.54) upon the addition of a fourth expert. Such results confirm the framework’s capacity for timely and robust policy adaptation in non-stationary SLA environments.
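The gating and masked-expansion ideas described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract only states that expert actions are fused at the step level, so the convex (softmax-weighted) combination used here is an assumption, as are the function names `gate_input`, `masked_softmax`, and `fuse_actions`, the three-slice action vectors, and the hand-picked gate logits (in CoMEx these would come from the DRL-trained gating network).

```python
import math
from typing import List, Sequence


def gate_input(observation: Sequence[float], sla_weights: Sequence[float]) -> List[float]:
    """Append the SLA weight vector to the observation (hypothetical helper)."""
    return list(observation) + list(sla_weights)


def masked_softmax(logits: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Softmax over active experts only; masked-out experts get weight 0."""
    active = [l for l, m in zip(logits, mask) if m]
    if not active:
        raise ValueError("at least one expert must be active")
    peak = max(active)  # subtract the max active logit for numerical stability
    exps = [math.exp(l - peak) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_actions(expert_actions: Sequence[Sequence[float]],
                 weights: Sequence[float]) -> List[float]:
    """Weighted sum of per-expert action vectors (assumed fusion rule)."""
    dim = len(expert_actions[0])
    return [sum(w * a[i] for w, a in zip(weights, expert_actions))
            for i in range(dim)]


# Illustrative numbers only: each row is one frozen expert's resource split
# across three slices. The fourth row is a slot for a later-added expert.
expert_actions = [
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.4, 0.4, 0.2],
]
gate_logits = [0.0, 0.0, 0.0, 5.0]  # placeholder for the gate network's output

# Before expansion the fourth expert is masked out, so it cannot influence
# the fused action regardless of its logit.
mask_before = [True, True, True, False]
w_before = masked_softmax(gate_logits, mask_before)
action_before = fuse_actions(expert_actions, w_before)

# After expansion the mask is lifted and the gate may use the new expert.
mask_after = [True, True, True, True]
w_after = masked_softmax(gate_logits, mask_after)
action_after = fuse_actions(expert_actions, w_after)

# Gate input for one step: observation with the SLA weights appended.
x = gate_input([0.5, 0.1], [0.7, 0.2, 0.1])
```

In this sketch the mask keeps the unused expert slot at exactly zero weight until a new expert is added, which is one simple way to extend the gate without disturbing the existing frozen experts. How CoMEx actually fine-tunes the gate after expansion is not specified in the abstract and is not modeled here.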
