We appreciate the comments by Zhao and Yao regarding our article, “Comparative Weight Change With Initiation and Adherence to Common Medications for Type 2 Diabetes” (doi: 10.1002/oby.70022). Here we provide a point-by-point response that includes clarifying points on our analytic approach (full technical details of which can be found in McGrath et al. [1]) and other more general aspects of causal inference for longitudinal data.

In their first broad criticism, Zhao and Yao state: “the treatment exposure definition, specifically ‘initiation and adherence,’ may introduce a composite exposure misclassification that undermines causal inference.” To clarify, “initiation and adherence” in our context describes the nature of the strategies/interventions that would be compared in our specified target trial [2]; that is, it characterizes the causal effect that our analysis was intended to inform. If the intent of the critique is to advocate for a different causal contrast, that is a question of estimand specification rather than a limitation of the analytic approach we used for the specified strategies. The fundamental benefit of a “target trial emulation” with observational data is not simply the use of “sophisticated approaches” but rather the transparency it lends to the causal question of interest, which is foundationally necessary to justify an analytic approach.

Zhao and Yao continue: “In clinical practice, adherence to GLP-1RAs and SGLT-2is is often constrained by side effects (e.g., nausea, genitourinary infections) and cost, which selectively affects patient continuation. Patients who remain adherent for 24 months likely differ systematically from those who discontinue early, not only in socioeconomic status but also in metabolic phenotype (e.g., baseline BMI, insulin resistance, and motivation for weight control). By modeling ‘adherence’ as a time-varying covariate rather than a stratified behavioral phenomenon, the study may have inadvertently captured the effect of a ‘high-adherence phenotype’ rather than the pharmacologic weight effect itself.”

The analytic approach was chosen to address the problem that adherence is, of course, dependent on patient characteristics that are predictive of the outcome (i.e., there is confounding). Further, it is structured to address the reality that adherence is not predetermined solely by a “phenotype” (i.e., adherence is not a time-fixed phenomenon). Rather, adherence is a time-varying process, and the reasons that someone adheres or does not at any given time may be affected by the history of outcome risk factors (i.e., there is time-varying confounding). Moreover, time-varying confounders may themselves be affected by past initiation and adherence. It is now well established that in a complex longitudinal setting such as this, an analysis that simply stratifies the data, comparing groups defined by post-baseline time-varying treatment status or patterns, is generally subject to “collider” or “selection” bias, even when all relevant measured covariates are further regressed, stratified, or conditioned on [3, 4]. Importantly, this can be true in an observational study or in a real trial with non-adherence. Inverse probability weighting is one approach that can validly recover causal effects in settings with time-varying confounders affected by treatment.

Zhao and Yao also state: “the analytical framework presumes that the recorded baseline and time-varying covariates sufficiently account for confounding by indication; yet, several clinically salient confounders, particularly the temporal evolution of glycemic control, were likely incompletely captured.” It is true that our analysis does require that recorded baseline and time-varying covariates are sufficient to control confounding (by indication or otherwise).
This limitation is common to all observational studies and even applies to well-conducted randomized trials, where post-baseline non-adherence and missing outcome measures can introduce similar bias [5, 6]. Thus, our study made realistic assumptions about the causal structure of imperfect data (including time-varying confounding, as discussed above, but also sporadically and informatively measured outcomes). More specifically, while this analysis did not adjust for time-varying A1c measures (due to a high degree of missingness), it did adjust for other relevant time-varying measured proxies of glycemic control, including new prescriptions for additional diabetes medications (beyond the initiated medication). Further, note that this study was limited to patients who had previously received metformin and aimed to emulate a target trial of second-line treatments, mitigating confounding by baseline differences in the clinical scenario.

Finally, Zhao and Yao state: “the decision to pool individual medications within drug classes may obscure clinically relevant heterogeneity.” We acknowledge heterogeneity among agents, particularly within GLP-1 receptor agonists, and noted this limitation clearly in the discussion. The study included data from 2010 through 2019. Thus, the study period overlapped minimally with the availability of semaglutide (Ozempic, approved in 2017) and not at all with tirzepatide (Mounjaro, approved in 2022). Dulaglutide (Trulicity) was the only long-acting GLP-1RA consistently available for much of the study period (approved in 2014). The decision to pool medications within class was based on sample size considerations. Importantly, even in the presence of this heterogeneity, our analysis can be understood as aiming to inform average outcome differences across treatment arms in a trial where the protocol requires initiating and adhering to a specific class in each arm but there is clinician discretion within class [7, 8].
Future research should examine individual drugs as data availability improves. We are also aware of at least one ongoing randomized trial comparing GLP-1 receptor agonists and SGLT-2is head to head, which will hopefully support an even more robust understanding of the comparative effects of these medications on weight change.

We absolutely agree that tolerability and cost are often pragmatic considerations that should drive treatment decisions.

The authors have no conflicts of interest to report.

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.