September 10, 2025 | Open Access

The Implementation of Test Time Augmentation in Deep Reinforcement Learning

Key Points

Test time augmentation (TTA) improves policy stability in deep reinforcement learning (DRL), raising performance by 4.78%.
Controlled state perturbation and majority voting enhance agent adaptability, with success rate rising by 9% under stable conditions.
Experimental analysis applied TTA within the LunarLander-v2 environment, demonstrating its efficacy for DRL agents with limited retraining.
Performance declines in highly chaotic environments, indicating that the method's robustness is limited under extreme randomness.

Abstract

Deep Reinforcement Learning (DRL) models often struggle with generalization and robustness, requiring costly retraining to adapt to environmental changes. To address this, the study proposes Test Time Augmentation (TTA) as a post-training method to enhance the policy stability of DRL agents. This work introduces a novel approach that applies TTA to DRL by leveraging controlled state perturbation, majority voting, and dynamic scaling of augmentations. This method allows agents to adapt to varying conditions without modifying the original model parameters, offering a lightweight yet effective solution to improving robustness. Experimental results on the LunarLander-v2 environment using Deep Q-Networks (DQN) demonstrate a 4.78% performance improvement and a 9% success rate improvement under stable conditions, as well as increased resilience against moderate noise. However, performance declines in highly chaotic environments, highlighting TTA’s limitations under extreme randomness. Overall, this study bridges the gap between TTA in computer vision and DRL, offering insights into practical and computationally efficient methods for improving policy robustness without retraining.
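
The combination of state perturbation and majority voting described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `tta_action`, the Gaussian noise model, the parameters `n_aug` and `noise_scale`, and the toy linear Q-function are all hypothetical. The dynamic scaling of augmentations is not shown because the abstract does not specify how it works.

```python
import numpy as np


def tta_action(q_values_fn, state, n_aug=8, noise_scale=0.01, rng=None):
    """Pick an action by majority vote over perturbed copies of `state`.

    `q_values_fn` maps a batch of states (n, state_dim) to Q-values
    (n, n_actions). The trained model itself is never modified.
    """
    rng = np.random.default_rng() if rng is None else rng
    state = np.asarray(state, dtype=np.float64)

    # Assumption: perturbation is zero-mean Gaussian noise on each state feature.
    noise = rng.normal(0.0, noise_scale, size=(n_aug, state.shape[0]))

    # Evaluate the original state together with its n_aug noisy copies.
    batch = np.vstack([state[None, :], state[None, :] + noise])

    # Each copy casts one vote for its greedy action.
    votes = np.argmax(q_values_fn(batch), axis=1)

    # Majority vote; ties resolve to the lowest action index.
    return int(np.argmax(np.bincount(votes)))


# Toy stand-in for a trained DQN (hypothetical): a linear Q-function with
# 8 state features and 4 actions, matching LunarLander-v2's dimensions.
W = np.zeros((8, 4))
W[:, 2] = 1.0  # action 2 scores highest whenever the features sum to > 0


def toy_q(batch):
    return batch @ W


action = tta_action(toy_q, np.ones(8), rng=np.random.default_rng(0))
```

With `noise_scale=0.0` every copy is identical, so the vote reduces to the agent's ordinary greedy action. In practice `toy_q` would be replaced by a forward pass of the trained network.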

Cite This Study

C. M. Dai (Wed, September 10, 2025) studied this question.

synapsesocial.com/papers/68c198c59b7b07f3a061a9b1 https://doi.org/10.1051/itmconf/20257801002
