
March 4, 2026 | Open Access

Agent-based control and machine learning-driven optimizations for communication networks in intelligent transportation systems

Key Points

The study aims to develop advanced methodologies for optimizing communication networks in Intelligent Transportation Systems using agent-based and machine learning techniques.
Implemented supervised and unsupervised machine learning techniques for anomaly detection in ITS traffic.
Developed agent-guided storage optimization strategies for edge-cloud networks.
Proposed RAN slicing and spectrum management methods for effective resource allocation in 5G communications.
Achieved higher accuracy in traffic anomaly detection using proposed machine learning frameworks.
Improved cache hit ratios and resource utilization in data caching techniques for ITS edge networks.
Demonstrated greater energy efficiency in spectrum management using agent-based control methods.

Abstract

Driven by a wealth of theoretical and practical knowledge, the adoption of Artificial Intelligence (AI) in wireless networks is soaring. 5G and Beyond 5G (B5G) communications are expected to rely heavily on data- and experience-driven optimization procedures that employ various Machine Learning (ML), Deep Learning (DL) and Reinforcement Learning (RL) algorithms. Enabled by pervasive data processing and transmission technologies, the Intelligent Transportation System (ITS) is receiving increasing attention as one of the main pillars of smart mobility in the B5G era, enabling advanced mobile applications while providing social as well as economic benefits. ITS, by design, encompasses the vehicular domain and a palette of networked infrastructural elements, namely Base Stations (BSs), Mobile Edge Computing (MEC) servers and Road-Side Units (RSUs), for accomplishing Device-to-Device (D2D) communications. To satisfy the multi-dimensional Quality-of-Service (QoS) demands of diverse applications, such as enhanced Mobile BroadBand (eMBB) and Ultra-Reliable and Low-Latency Communications (URLLC), the ITS Radio Access Network (RAN) utilizes Vehicle-to-Everything (V2X) links selected according to high bandwidth needs or strict latency requirements. Our central objective in this thesis is to pioneer advanced methodologies that enhance ITS networks and communications, employing agent-based stochastic control mechanisms and machine learning driven optimizations for future urban transportation applications. To apply data-driven performance optimization in the ITS environment, we adopt a holistic approach and propose solutions to enhance various aspects of the ITS domain. These include supervised agent-guided predictive storage optimization techniques in ITS edge-cloud networks, such as centralized management for data caching in ITS edge networks; and agent-based ITS RAN resource management mechanisms, such as spectrum allocation and network slicing application optimizations.
For the supervised [...] 2) a positive correlation between the accuracy of anomaly detection and the number of neurons involved in the computation is discovered with the proposed CLAD approach. As the supervised ML based approach for End-to-End (E2E) anomaly detection in ITS data traffic, we propose the Graph Neural Network based Anomaly Detection (GNN-AD) framework for achieving reliability in multi-dimensional data streams. Leveraging unsupervised and supervised machine learning algorithms, namely BiDirectional Generative Adversarial Network (BiGAN), affinity propagation and Graph Convolutional Network (GCN), we semantically model the complex distributions of ITS data streams and perform anomaly detection utilizing graph representations. Our contributions for ML and DL based supervised ITS traffic optimization can be stated as: 1) stable performance and higher anomaly detection accuracy compared to other approaches are achieved with GNN-AD, providing better reliability for ITS data streams; 2) a greater Area Under Curve (AUC) value is obtained in the Receiver Operating Characteristic (ROC) space, with a mostly higher True Positive Rate (TPR).

Agent-guided predictive storage optimization in ITS edge-cloud networks is addressed for next-generation Machine-Type Communications (MTC) to alleviate the shortcomings of Cloud Computing (CC) infrastructures in terms of service delays and network loads. For that purpose, we tackle the optimal data caching problem in geographically distributed ITS edge networks and propose the Online Monte CArlo planning based data caching (OMCA) scheme. Considering multi-dimensional QoS requirements in the ITS vehicular environment, OMCA uses the Monte Carlo Tree Search (MCTS) algorithm with subgoal based temporal abstractions for automatically discovering and optimizing data caching actions. We compared the performance of OMCA with MCTS, Deep Q-Learning (DQL) and recency based cache management algorithms.
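To make the cache hit ratio comparison concrete, the following minimal Python sketch replays a request trace through a Least Recently Used (LRU) cache, one common recency based policy. It is an illustration only: the abstract does not say which recency based algorithm was used as a baseline, and the function name `lru_hit_ratio` and the toy trace are assumptions introduced here.

```python
from collections import OrderedDict


def lru_hit_ratio(requests, capacity):
    """Replay a request trace through an LRU cache and return the hit ratio.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for item in requests:
        if item in cache:
            hits += 1
            cache.move_to_end(item)        # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used key
            cache[item] = True
    return hits / len(requests) if requests else 0.0


# Toy trace of content identifiers (assumed data, not measured ITS traffic).
trace = ["a", "b", "a", "c", "b", "d", "a", "b"]
for size in (2, 3, 4):
    print(size, lru_hit_ratio(trace, size))
```

On this toy trace the hit ratio grows with the cache size, which is the kind of resource dependence any caching scheme has to be evaluated against.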
Simulation results of data caching techniques have been obtained using the numerical computation library TensorFlow and the open-source neural-network library Keras, running on cloud and edge computing resources of the BeIntelli smart mobility platform. Our contributions regarding agent-based control for optimizing storage allocation in ITS edge-cloud networks can be listed as follows: 1) subgoal based online planning is achieved, resulting in better utilization of cache resources and higher cache hit ratios when more resources are available and fewer nodes are involved in the cache policy computation; 2) OMCA and MCTS are observed to converge as the available cache space decreases, due to a lack of resource availability based on the long-term caching policy computation with temporal abstractions.

RAN slicing is emerging as the de facto spectrum resource allocation method to satisfy diverse QoS requirements in 5G vehicular networks. Single- and multi-policy ITS RAN resource management methods have been proposed for optimal policy-based bandwidth utilization, energy-efficient Radio Resource Block (RRB) allocation and multi-policy spectrum utilization purposes. Performance evaluations for single and multiple MS agents have been obtained using Python-based ITS simulations considering dynamic vehicular eMBB and URLLC slice requests, mobility characteristics, Access Point (AP) availability and diverse V2X traffic. First, we tackle the agent-based RAN slicing optimization problem in 5G V2X communications and present the hierarchical Deep Q-Network (h-DQN) based Soft Slicing (HSS) method for model-free opportunistic slice management. HSS consists of a multi-controller learning framework where a high-level meta-controller takes the state as input to determine a subgoal, and a low-level controller decides on the action based on the given subgoal and the state.
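The two-level decision structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the HSS implementation: it swaps the deep Q-networks for tabular Q-values, and the class name `TwoLevelAgent`, the subgoal and action labels, and the hyperparameters are all hypothetical. Following the usual h-DQN convention, the low-level controller learns from an intrinsic reward tied to the subgoal, while the meta-controller learns from the extrinsic (environment) reward.

```python
import random
from collections import defaultdict


class TwoLevelAgent:
    """Tabular stand-in for a meta-controller / controller pair (illustrative)."""

    def __init__(self, subgoals, actions, epsilon=0.1, alpha=0.5, gamma=0.9, seed=0):
        self.subgoals = list(subgoals)
        self.actions = list(actions)
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma
        self.rng = random.Random(seed)
        self.q_meta = defaultdict(float)  # (state, subgoal) -> value
        self.q_ctrl = defaultdict(float)  # (state, subgoal, action) -> value

    def _eps_greedy(self, options, value_of):
        # Explore with probability epsilon, otherwise pick the best-valued option.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(options)
        return max(options, key=value_of)

    def choose_subgoal(self, state):
        """High level: map the state to a subgoal."""
        return self._eps_greedy(self.subgoals, lambda g: self.q_meta[(state, g)])

    def choose_action(self, state, subgoal):
        """Low level: map (state, subgoal) to a primitive action."""
        return self._eps_greedy(
            self.actions, lambda a: self.q_ctrl[(state, subgoal, a)]
        )

    def update_controller(self, state, subgoal, action, intrinsic_reward, next_state):
        # One-step Q-learning update for the low-level controller.
        best_next = max(self.q_ctrl[(next_state, subgoal, a)] for a in self.actions)
        key = (state, subgoal, action)
        target = intrinsic_reward + self.gamma * best_next
        self.q_ctrl[key] += self.alpha * (target - self.q_ctrl[key])

    def update_meta(self, state, subgoal, extrinsic_reward, next_state):
        # One-step Q-learning update for the meta-controller.
        best_next = max(self.q_meta[(next_state, g)] for g in self.subgoals)
        key = (state, subgoal)
        target = extrinsic_reward + self.gamma * best_next
        self.q_meta[key] += self.alpha * (target - self.q_meta[key])


# Hypothetical labels; epsilon=0.0 makes the choices purely greedy.
agent = TwoLevelAgent(["serve_embb", "serve_urllc"], ["grant", "defer"], epsilon=0.0)
goal = agent.choose_subgoal("s0")
action = agent.choose_action("s0", goal)
```

In a full training loop the controller would act until the subgoal is reached or abandoned, after which the meta-controller is updated with the extrinsic reward accumulated over that interval.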
We compared the performance of HSS with slotted ALOHA and traditional model-free and model-based RL algorithms using the numerical computation library TensorFlow, the open-source neural-network library Keras and the numeric computing environment MATLAB. Second, for optimized RAN resource management with agent-based control, we also propose the Lazy Skip Markov Decision Process (LS-MDP) formulation, in which spectrum agents individually perform fine or coarse stochastic control in an energy-efficient manner, depending on performance requirements, by incorporating varying levels of laziness. Furthermore, we employ LS-MDP based policies in a multi-policy setting and utilize hybrid QoS reward per energy consumption (HQEC) as our performance metric to propose the LazyRAN framework as an energy-efficient RRB allocation approach for multi-policy RAN slicing in B5G ITS edge networks. For the performance evaluation, Python-based ITS simulations have been performed employing the sklearn ML library. Our contributions regarding agent-based control for ITS RAN spectrum management can be listed as: 1) pursuing task-decomposition based subgoal policies using the hierarchical HSS architecture resulted in better utilization of bandwidth resources with increased stability and sample efficiency; 2) LS-MDP based Q-learning as a fixed spectrum allocation policy had superior performance in terms of HQEC compared to traditional Q-learning based spectrum allocation methods; 3) RRB management with the LS-MDP based problem formulation is shown to consume less energy overall for the computation of resource distribution as well as frequency allocation; 4) dynamic policy utilization with the LazyRAN framework, which uses multi-policy LS-MDP variations, demonstrated greater HQEC accumulation while minimizing energy consumption and maximizing overall network throughput.
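The trade-off between fine and coarse control can be illustrated with a toy reward-per-energy calculation. The sketch below is an assumption-laden illustration and not the LS-MDP formulation or the HQEC definition, neither of which the abstract specifies: it interprets "laziness" as repeating the previous allocation for `skip` steps, and the function name `run_lazy_policy`, the cost constants and the demand trace are all hypothetical.

```python
def run_lazy_policy(demands, decide, skip, decision_cost=1.0, hold_cost=0.1):
    """Re-decide every (skip + 1) steps; otherwise repeat the last allocation.

    Returns (total_reward, total_energy, reward_per_energy).
    Illustrative only: costs and reward rule are assumptions.
    """
    if skip < 0:
        raise ValueError("skip must be non-negative")
    total_reward = 0.0
    total_energy = 0.0
    allocation = None
    for t, demand in enumerate(demands):
        if t % (skip + 1) == 0:
            allocation = decide(demand)  # fresh decision, higher energy cost
            total_energy += decision_cost
        else:
            total_energy += hold_cost    # reusing the old allocation is cheap
        # Toy QoS reward: 1 when the allocation covers the demand, else 0.
        total_reward += 1.0 if allocation >= demand else 0.0
    ratio = total_reward / total_energy if total_energy else 0.0
    return total_reward, total_energy, ratio


# Assumed per-step resource block demands; the policy allocates exactly the demand.
demands = [2, 2, 3, 3, 1, 1]
for skip in (0, 1, 2):
    print(skip, run_lazy_policy(demands, lambda d: d, skip))
```

With these assumed numbers, skipping decisions lowers the energy spent and raises reward per energy, but at `skip=2` one demand goes unserved, which is the kind of QoS versus energy balance a reward-per-energy metric is meant to capture.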
