What is the clinical evidence from this study?

Study design: Other. Intervention: U2 black-box attack vs. White-box attack. Primary outcome: Attack cost.

February 16, 2021 · Open Access

Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments

Key Result

The U2 black-box reward poisoning attack strategy achieved an attack cost comparable to the optimal white-box attack against unknown reinforcement learning agents in unknown environments.

Structured PICO

Population

Reinforcement learning (RL) agents with unknown algorithms in an unknown environment

Intervention

U2 black-box reward poisoning attack

Comparator

State-of-the-art white-box attack

Outcome

Attack performance (ability to mislead RL agents to learn a nefarious policy)

Demonstrates the feasibility of reward poisoning attacks against reinforcement learning agents even in the most challenging black-box setting.

Limitations

Assumes learners follow a no-regret RL algorithm
Assumes the environment is modeled as an episodic Markov Decision Process

Abstract

We study black-box reward poisoning attacks against reinforcement learning (RL), in which an adversary aims to manipulate the rewards to mislead a sequence of RL agents with unknown algorithms to learn a nefarious policy in an environment unknown to the adversary a priori. That is, our attack makes minimum assumptions on the prior knowledge of the adversary: it has no initial knowledge of the environment or the learner, and neither does it observe the learner's internal mechanism except for its performed actions. We design a novel black-box attack, U2, that can provably achieve a near-matching performance to the state-of-the-art white-box attack, demonstrating the feasibility of reward poisoning even in the most challenging black-box setting.
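To make the idea of reward poisoning concrete, the following is a minimal sketch on a toy multi-armed bandit (a one-state episodic environment). It is not the U2 attack from the paper: here the attacker is assumed to know the true mean rewards, which is a white-box-style assumption that U2 explicitly avoids. The function name `run_poisoned_bandit`, the epsilon-greedy learner, the `margin` parameter, and the use of total absolute perturbation as the "attack cost" are all illustrative assumptions.

```python
import random


def run_poisoned_bandit(true_means, target_arm, margin=0.2,
                        episodes=2000, epsilon=0.1, noise=0.05, seed=0):
    """Toy reward poisoning against an epsilon-greedy bandit learner.

    ASSUMPTION: the attacker knows `true_means` (white-box style); the U2
    attack described in the abstract does not have this knowledge.
    Returns (learned_arm, attack_cost, counts).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms   # learner's running mean reward per arm
    counts = [0] * n_arms        # number of pulls per arm
    attack_cost = 0.0            # total absolute reward perturbation

    for _ in range(episodes):
        # Learner: epsilon-greedy over its (poisoned) reward estimates.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])

        clean_reward = true_means[arm] + rng.gauss(0.0, noise)

        # Attacker: push every non-target arm down until it looks worse
        # than the target arm by at least `margin`; never raise rewards.
        if arm != target_arm:
            delta = min(0.0, true_means[target_arm] - margin - true_means[arm])
        else:
            delta = 0.0

        poisoned_reward = clean_reward + delta
        attack_cost += abs(delta)

        # Learner only ever observes the poisoned reward.
        counts[arm] += 1
        estimates[arm] += (poisoned_reward - estimates[arm]) / counts[arm]

    learned_arm = max(range(n_arms), key=lambda a: estimates[a])
    return learned_arm, attack_cost, counts


if __name__ == "__main__":
    arm, cost, pulls = run_poisoned_bandit([0.9, 0.5, 0.2], target_arm=2)
    print("learned arm:", arm, "attack cost:", round(cost, 2), "pulls:", pulls)
```

In this sketch the learner ends up preferring the attacker's target arm even though it has the lowest true reward, and the attacker pays a cost only on pulls of non-target arms. A black-box attacker in the paper's setting would additionally have to learn about the environment and the learner from observed actions alone.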

Cite This Study

Rakhsha et al. (2021) compared the U2 black-box attack with a white-box attack, with attack cost as the primary outcome. The U2 black-box reward poisoning attack strategy achieved an attack cost comparable to the optimal white-box attack against unknown reinforcement learning agents in unknown environments.

synapsesocial.com/papers/6a11d0cbed9c06332dfd440e https://doi.org/10.48550/arxiv.2102.08492
