In cluster-randomized trials (CRTs), missing data can occur in various ways, including missing values in outcomes and baseline covariates at the individual or cluster level, or completely missing information for non-participants. Among the various types of missing data in CRTs, missing outcomes have attracted the most attention. However, no existing methods simultaneously address all aforementioned types of missing data in CRTs. To fill in this gap, we propose a doubly robust estimator for the average treatment effect on a variety of effect measure scales. The proposed estimator simultaneously handles missing outcomes under missingness at random, missing covariates without constraining the missingness mechanism, and missing cluster-population sizes via a uniform sampling mechanism. Furthermore, we detail key considerations to improve precision by specifying the optimal weights, leveraging machine learning, and modeling the treatment assignment mechanism. Finally, to evaluate the impact of violating missing data assumptions, we contribute a new sensitivity analysis framework tailored to CRTs. We assess the performance of the proposed methods through simulations and illustrate their use in a real data application.
Wang et al. (Tue,) studied this question.