What question did this study set out to answer?

The research aims to develop an algorithm for identifying optimal treatment policies based on observational data.

February 28, 2026 | Open Access

Adaptive welfare maximization

Key Points

Develops an algorithm for learning optimal treatment policies from observational data
Combines doubly robust welfare estimation with sample splitting
Handles complex covariates and unknown propensity scores
Assesses policy complexity adaptively
Achieves minimax-optimal rate of convergence in expected regret
Selects policy complexity with nearly oracle performance
Outperforms traditional methods in simulation

Abstract

We consider the problem of learning optimal treatment policies from observational data. We propose an algorithm that combines doubly robust welfare estimation, to accommodate rich covariates and unknown propensity scores, and sample splitting, to adaptively select policy complexity. We show that the resulting treatment rule achieves the minimax-optimal rate of convergence in expected regret while selecting a suitable policy complexity with nearly oracle performance. Our analysis avoids unnecessarily restrictive assumptions commonly imposed on the data-generating process or on first-stage nonparametric estimators and yields a sharp characterization of the relevant universal constants. The practical performance of the proposed method is demonstrated in a simulation study.
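The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the policy class (treat-or-not rules on equal-width bins of a scalar covariate in [0, 1)), the linear nuisance models, the clipping bounds, the candidate complexities, and every function name below are assumptions made only for illustration. The sketch computes doubly robust (AIPW) scores with cross-fitting, fits one rule per complexity level on a training half, and picks the complexity whose rule has the highest estimated welfare on a hold-out half.

```python
import numpy as np

def design(v):
    # Design matrix [1, x] for the (assumed) linear nuisance models.
    return np.column_stack([np.ones_like(v), v])

def aipw_scores(x, d, y, fit, ev):
    # Doubly robust scores on `ev`, with nuisances estimated on `fit`.
    xf, df, yf = x[fit], d[fit], y[fit]
    b1 = np.linalg.lstsq(design(xf[df == 1]), yf[df == 1], rcond=None)[0]
    b0 = np.linalg.lstsq(design(xf[df == 0]), yf[df == 0], rcond=None)[0]
    g = np.linalg.lstsq(design(xf), df.astype(float), rcond=None)[0]
    xe, de, ye = x[ev], d[ev], y[ev]
    m1, m0 = design(xe) @ b1, design(xe) @ b0
    e = np.clip(design(xe) @ g, 0.05, 0.95)  # clipped propensity estimate
    return m1 - m0 + de * (ye - m1) / e - (1 - de) * (ye - m0) / (1 - e)

def crossfit_scores(x, d, y, rng):
    # Two-fold cross-fitting: each fold is scored with the other fold's fits.
    n = len(x)
    perm = rng.permutation(n)
    a, b = perm[: n // 2], perm[n // 2:]
    gamma = np.empty(n)
    gamma[a] = aipw_scores(x, d, y, fit=b, ev=a)
    gamma[b] = aipw_scores(x, d, y, fit=a, ev=b)
    return gamma

def bins(x, k):
    # Index of the equal-width bin (out of k) containing each x in [0, 1).
    return np.minimum((x * k).astype(int), k - 1)

def fit_rule(x, gamma, k):
    # Welfare-maximizing rule in the k-bin class: treat a bin if its score sum is positive.
    return np.bincount(bins(x, k), weights=gamma, minlength=k) > 0

def select_rule(x, d, y, ks=(1, 2, 4, 8, 16), seed=0):
    # Sample splitting: fit rules on one half, compare complexities on the other.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x))
    tr, ho = perm[: len(x) // 2], perm[len(x) // 2:]
    g_tr = crossfit_scores(x[tr], d[tr], y[tr], rng)
    g_ho = crossfit_scores(x[ho], d[ho], y[ho], rng)
    best = None
    for k in ks:
        rule = fit_rule(x[tr], g_tr, k)
        welfare = np.mean(rule[bins(x[ho], k)] * g_ho)  # gain over treating no one
        if best is None or welfare > best[2]:
            best = (k, rule, welfare)
    return best

def simulate(n, seed):
    # Hypothetical data: propensity 0.3 + 0.4x, treatment effect sin(2*pi*x).
    rng = np.random.default_rng(seed)
    x = rng.uniform(size=n)
    d = (rng.uniform(size=n) < 0.3 + 0.4 * x).astype(int)
    y = x + d * np.sin(2 * np.pi * x) + rng.normal(size=n)
    return x, d, y

x, d, y = simulate(4000, seed=1)
k, rule, welfare = select_rule(x, d, y)
print("selected bins:", k, "estimated hold-out welfare gain:", round(welfare, 3))
```

In this sketch the outcome model is deliberately misspecified (linear, while the simulated effect is sinusoidal); the scores remain usable because the propensity model is correct, which is the double robustness the abstract relies on. Computing the scores separately within each half keeps the hold-out comparison independent of the fitted rules.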
