February 8, 2024 | Open Access

Differentially Private Model-Based Offline Reinforcement Learning

Abstract

We address offline reinforcement learning with privacy guarantees, where the goal is to train a policy that is differentially private with respect to individual trajectories in the dataset. To achieve this, we introduce DP-MORL, a model-based reinforcement learning (MBRL) algorithm with differential privacy guarantees. A private model of the environment is first learned from offline data using DP-FedAvg, a training method for neural networks that provides differential privacy guarantees at the trajectory level. Then, we use model-based policy optimization to derive a policy from the (penalized) private model, without any further interaction with the system or access to the input data. We empirically show that DP-MORL enables the training of private RL agents from offline data, and we outline the price of privacy in this setting.
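The trajectory-level guarantee mentioned in the abstract rests on a clip-then-noise aggregation step. The following is a minimal sketch of one such step in the style of DP-FedAvg, not the paper's implementation: each trajectory contributes one update vector, every update is clipped to a fixed L2 norm, and Gaussian noise scaled to that norm is added to the sum before averaging. The function names, the parameters `clip_norm` and `noise_multiplier`, and the use of plain Python lists are assumptions made for illustration. Per-round sampling of trajectories and privacy accounting are omitted.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale `update` so that its L2 norm is at most `clip_norm`."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def private_average(per_trajectory_updates, clip_norm, noise_multiplier, rng):
    """Clip each trajectory's update, sum them, add Gaussian noise, average.

    Hypothetical helper: one update vector per trajectory, so clipping
    bounds the influence of any single trajectory on the result.
    """
    n = len(per_trajectory_updates)
    dim = len(per_trajectory_updates[0])
    total = [0.0] * dim
    for update in per_trajectory_updates:
        clipped = clip_update(update, clip_norm)
        total = [t + c for t, c in zip(total, clipped)]
    # Noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [x / n for x in noisy]


# Usage: two toy "trajectory updates" for a two-parameter model.
rng = random.Random(0)
updates = [[3.0, 4.0], [0.3, 0.4]]
model_update = private_average(updates, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

A larger `noise_multiplier` gives a stronger privacy guarantee at the cost of a noisier model update, which is one way to read the "price of privacy" the abstract refers to.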

Cite This Study

Rio et al. (2024) studied this question.

synapsesocial.com/papers/68e7b940b6db64358770fb5f https://doi.org/10.48550/arxiv.2402.05525
