AI models for embryo selection often rely on correlational metrics that ignore clinical confounders. We present a target trial emulation framework for approximating the causal effects of FEMI, a foundational AI model for non-invasive embryo assessment, in a multi-center trial emulation (n = 4674). Propensity score matching showed a robust association between FEMI-Ploidy and implantation failure, with an average treatment effect (ATE) of −0.131 (95% CI −0.196, −0.066) in the development cohort and −0.157 (95% CI −0.254, −0.054) in the external cohort. In a comparative efficacy analysis using S-Learner models, a high-risk FEMI score carried a significantly stronger individual treatment effect (ITE) penalty on implantation than other scoring mechanisms (p < 0.0001). This superiority persisted after adjusting for maternal age, suggesting that FEMI captures unique biological features. This causal framework establishes a rigorous standard for AI validation in IVF and provides the necessary pre-clinical justification for prospective randomized controlled trials.
Rajendran et al. studied this question.
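To make the S-Learner idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear base learner, a single standardized confounder standing in for maternal age, a continuous outcome for simplicity, and fully synthetic data with a known effect of −0.15; the function names `design_matrix`, `fit_s_learner`, and `predict_ite` are hypothetical. An S-Learner fits one outcome model that takes the treatment indicator as an ordinary feature, then estimates each ITE as the difference between the model's predictions with the indicator set to 1 and to 0.

```python
import numpy as np

def design_matrix(X, t):
    """Stack [intercept, covariates, treatment, treatment x covariates]."""
    t = np.asarray(t, dtype=float).reshape(-1, 1)
    return np.hstack([np.ones_like(t), X, t, t * X])

def fit_s_learner(X, t, y):
    """Fit a single least-squares model with treatment as a feature."""
    coef, *_ = np.linalg.lstsq(design_matrix(X, t), y, rcond=None)
    return coef

def predict_ite(coef, X):
    """ITE = prediction with treatment forced to 1 minus forced to 0."""
    n = len(X)
    y1 = design_matrix(X, np.ones(n)) @ coef
    y0 = design_matrix(X, np.zeros(n)) @ coef
    return y1 - y0

# Synthetic data only (an assumption for illustration, not trial data).
rng = np.random.default_rng(0)
n = 5000
age = rng.normal(0.0, 1.0, n)                  # standardized confounder
p_high_risk = 1.0 / (1.0 + np.exp(-0.8 * age)) # age drives the "treatment"
t = rng.binomial(1, p_high_risk)               # 1 = high-risk score
true_effect = -0.15
y = 0.5 - 0.1 * age + true_effect * t + rng.normal(0.0, 0.1, n)

X = age.reshape(-1, 1)
coef = fit_s_learner(X, t, y)
ite = predict_ite(coef, X)
ate_hat = ite.mean()                           # ATE as the mean of the ITEs

# Unadjusted comparison, confounded by age.
naive_diff = y[t == 1].mean() - y[t == 0].mean()
print(f"S-Learner ATE estimate: {ate_hat:.3f}")
print(f"Naive difference in means: {naive_diff:.3f}")
```

In this toy setup the naive difference in means overstates the penalty because the confounder affects both the score and the outcome, while the adjusted S-Learner estimate lands close to the simulated effect. A real analysis would likely use a flexible base learner and a binary outcome model.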