What question did this study set out to answer?

This research aims to improve nonlinear model predictive control (NMPC) for autonomous driving by using reinforcement learning to adjust the controller's cost-function weights in real time.

June 13, 2026 | Open Access

Reinforcement Learning-Enhanced Adaptive NMPC for Safe Autonomous Driving

Key Points

Integrated Proximal Policy Optimization with NMPC to dynamically adjust weight matrices.
Established a physics-based model for a two-wheeled differential drive vehicle.
Evaluated performance under conditions of friction uncertainty and sensor noise.
With real-time weight adaptation, achieved a 71% reduction in tracking error compared with the NMPC baseline.
Demonstrated improved safety by integrating a Control Barrier Function (CBF) layer for real-time obstacle avoidance.
Maintained reliable tracking accuracy and safety margins under various disturbances.
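
To make the vehicle model in the key points concrete, the following is a minimal sketch of a kinematic differential-drive update. It is an assumption for illustration only: the study describes a physics-based model whose equations are not given here, and the function names, wheel base, and time step below are hypothetical.

```python
import math


def diff_drive_step(state, v_left, v_right, wheel_base=0.3, dt=0.05):
    """Advance a kinematic differential-drive pose by one forward-Euler step.

    state: (x, y, theta) in metres and radians.
    v_left, v_right: ground speeds of the left and right wheels in m/s.
    wheel_base and dt are illustrative values, not taken from the paper.
    """
    x, y, theta = state
    v = 0.5 * (v_right + v_left)             # forward speed of the axle midpoint
    omega = (v_right - v_left) / wheel_base  # yaw rate from the wheel speed difference
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += omega * dt
    # Wrap the heading into [-pi, pi] so heading errors stay well defined.
    theta = math.atan2(math.sin(theta), math.cos(theta))
    return (x, y, theta)


def rollout(state, wheel_speeds, **kwargs):
    """Apply a sequence of (v_left, v_right) commands and return every pose visited."""
    trajectory = [state]
    for v_left, v_right in wheel_speeds:
        state = diff_drive_step(state, v_left, v_right, **kwargs)
        trajectory.append(state)
    return trajectory
```

A predictive controller would call a model of this kind repeatedly over its horizon. Effects such as the friction uncertainty evaluated in the study are not represented in this purely kinematic sketch.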

Abstract

Nonlinear Model Predictive Control (NMPC) has garnered significant attention in autonomous systems due to its ability to predict future states and manage complex vehicle dynamics. However, the adaptability of existing NMPC methods is constrained by the need to manually set the weight coefficients in the NMPC cost function. This study explores a novel approach that integrates NMPC with Reinforcement Learning (RL), specifically Proximal Policy Optimization (PPO), to dynamically adjust the NMPC weight matrices. The investigation begins by establishing a physics-based model for a two-wheeled differential drive vehicle. A PPO model is then trained and deployed in real time to adapt the NMPC weight matrices, achieving a 71% reduction in tracking error compared with the NMPC baseline. Importantly, the performance gain arises from PPO’s ability to reshape the NMPC cost function in real time, amplifying both orientation and lateral penalties in curves while relaxing them on straights, thereby enabling adaptive trade-offs between accuracy and control effort that static-weight NMPC cannot achieve. To enhance safety, the controller is integrated with a Control Barrier Function (CBF) layer for real-time obstacle avoidance, while PPO’s real-time weight adaptation contributes to improved tracking performance relative to NMPC+CBF. Finally, robustness evaluations under friction uncertainty, sensor noise, and path disturbances demonstrate that the PPO+NMPC+CBF method maintains reliable tracking accuracy and safety margins.
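
The cost-reshaping idea in the abstract can be sketched as follows. This is a hedged illustration, not the authors' implementation: the quadratic cost form, the weight names, the base values, and the clipping bounds are all assumptions, and `curvature_rule` is a hand-written stand-in for the trained PPO policy, which would produce the scale factors from its observations.

```python
# Illustrative base weights; the paper's actual weight matrices are not given here.
BASE_WEIGHTS = {"lateral": 1.0, "heading": 1.0, "effort": 0.1}


def scale_weights(base, scales, lo=0.1, hi=10.0):
    """Multiply each base weight by a scale factor clipped to [lo, hi].

    In an RL setup the scale factors would be the policy's action; clipping
    keeps the resulting cost positive and bounded.
    """
    return {k: base[k] * min(max(scales.get(k, 1.0), lo), hi) for k in base}


def curvature_rule(curvature, gain=5.0):
    """Hypothetical stand-in for a learned policy.

    Raises the lateral and heading penalties in curves and leaves them at
    their base values on straights (curvature == 0).
    """
    s = 1.0 + gain * abs(curvature)
    return {"lateral": s, "heading": s, "effort": 1.0}


def stage_cost(lateral_err, heading_err, control, weights):
    """Quadratic stage cost with diagonal weights on errors and control effort."""
    v, omega = control
    return (weights["lateral"] * lateral_err ** 2
            + weights["heading"] * heading_err ** 2
            + weights["effort"] * (v ** 2 + omega ** 2))


# The same tracking error costs more in a curve than on a straight.
w_straight = scale_weights(BASE_WEIGHTS, curvature_rule(0.0))
w_curve = scale_weights(BASE_WEIGHTS, curvature_rule(0.4))
cost_straight = stage_cost(0.1, 0.2, (1.0, 0.0), w_straight)
cost_curve = stage_cost(0.1, 0.2, (1.0, 0.0), w_curve)
```

With static weights the two costs would be identical. Making the error penalties depend on the driving context is what lets the optimizer trade accuracy against control effort differently in curves and on straights.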
