Comparing treatment effects is an important aim in epidemiology. One measure commonly used for this comparison is the average treatment effect (ATE), which quantifies the causal relation between a treatment and an outcome. Randomized controlled trials (RCTs) are considered the gold standard for estimating it. Nonetheless, conducting an RCT is not always practical because of cost, time, or logistical constraints. For that reason, observational data are increasingly used, but this comes with many challenges. One of them is the presence of confounders, which can bias the estimate. Balancing methods were created to alleviate this issue. Several balancing methods have been introduced recently, but there are currently no clear guidelines on which one to use. To this end, we compare one classical method to three more recent ones using simulated data. To define an estimation method for an average treatment effect from observational data, one must specify the balancing method, the estimator, and, if needed, the type of regressor for both the balancing and estimation tasks. We compare four balancing methods: Inverse Probability of Treatment Weighting (IPTW), Energy Balancing (EB), Kernel Optimal Matching (KOM), and covariate balancing by Tailored Loss Function (TLF); with three classical estimators: average weighting, doubly robust scheme, and linear regression coefficient; using, where required, three types of regressor: a correctly specified logistic regression, a misspecified logistic regression, and a random forest; for a total of 24 estimation methods. These methods are tested on simulated data across 36 scenarios in which sample size, treatment probability, and confounding level vary. Results show that the choice of balancing method matters as much as the choice of estimator. The type of regressor had a limited impact when estimating the ATE; its impact was larger when estimating the average treatment effect on the treated (ATT).
EB demonstrates the best performance overall, TLF yields poor estimates in most settings, and KOM and IPTW give good estimates in simple settings. However, treatment rarity and confounding level heavily affect these last two methods. In most scenarios, a doubly robust estimator is the best choice for estimating both the ATE and the ATT, irrespective of the balancing method. Standard methods, such as IPTW with a logistic regression, give good results when the probability of treatment is not too far from 1/2. However, in our simulations, EB with the linear regression estimator appears to be the best choice when treatment is rarer.
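To make the terminology concrete, the following is a minimal sketch of two of the ingredients named above: IPTW with a correctly specified logistic regression, combined with either a weighted-average estimator or a doubly robust (augmented IPW) estimator of the ATE. It is not the simulation design of the paper. The data-generating process, its coefficients, and all function names (`simulate`, `fit_logistic`, `iptw_ate`, `doubly_robust_ate`) are hypothetical choices made for illustration, and EB, KOM, and TLF are not shown.

```python
import numpy as np


def simulate(n, seed=0):
    """Toy observational data with one confounder x (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    p = 1.0 / (1.0 + np.exp(-0.8 * x))          # true propensity score
    t = rng.binomial(1, p)                       # treatment depends on x
    y = 2.0 * t + 1.5 * x + rng.normal(size=n)   # true ATE is 2 by construction
    return x, t, y


def fit_logistic(X, t, n_iter=25):
    """Logistic regression fitted by Newton-Raphson; X includes an intercept."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (t - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta += np.linalg.solve(hess, grad)
    return beta


def propensity(x, t):
    """Estimated probability of treatment given x, plus the design matrix."""
    X = np.column_stack([np.ones_like(x), x])
    e = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, t)))
    return X, e


def naive_difference(t, y):
    """Unadjusted difference in means; biased when x confounds t and y."""
    return y[t == 1].mean() - y[t == 0].mean()


def iptw_ate(x, t, y):
    """Weighted-average ATE estimate with normalized inverse probability weights."""
    _, e = propensity(x, t)
    w1 = t / e
    w0 = (1 - t) / (1 - e)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)


def doubly_robust_ate(x, t, y):
    """Augmented IPW: linear outcome models corrected by weighted residuals."""
    X, e = propensity(x, t)
    b1 = np.linalg.lstsq(X[t == 1], y[t == 1], rcond=None)[0]
    b0 = np.linalg.lstsq(X[t == 0], y[t == 0], rcond=None)[0]
    m1, m0 = X @ b1, X @ b0
    return np.mean(m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e))


if __name__ == "__main__":
    x, t, y = simulate(20000)
    print("naive        :", naive_difference(t, y))
    print("IPTW         :", iptw_ate(x, t, y))
    print("doubly robust:", doubly_robust_ate(x, t, y))
```

In this toy setting the unadjusted difference in means overstates the effect because the confounder raises both the treatment probability and the outcome, while both adjusted estimators should land close to the true value of 2. The treatment probability here is centered on 1/2, which is the regime where the abstract reports that standard IPTW behaves well; the sketch says nothing about the rare-treatment scenarios.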
Peyrot et al. (Wed,) studied this question.