August 14, 2025

Pulling back the curtain: the road from statistical estimand to machine-learning based estimator for epidemiologists (no wizard required)

Key Points

Machine-learning based estimators are increasingly used for causal inference in epidemiology because they can relax unnecessary model specification assumptions.
The paper derives an estimator based on the efficient influence function and shows that it can validly incorporate machine learning by demonstrating the rate double robustness property.
The derivations rely mainly on algebra and on foundational results from statistical inference, which the paper explains.
Understanding these steps supports epidemiologists, who are often the ones defining new research questions and translating them into new parameters to be estimated.

Abstract

Epidemiologists increasingly use causal inference methods that rely on machine learning, as these approaches can relax unnecessary model specification assumptions. While deriving and studying asymptotic properties of such estimators is a task usually associated with statisticians, it is useful for epidemiologists to understand the steps involved, as epidemiologists are often at the forefront of defining important new research questions and translating them into new parameters to be estimated. In this paper, our goal was to provide a relatively accessible guide through the process of (i) deriving an estimator based on the so-called efficient influence function (which we define and explain), and (ii) showing such an estimator’s ability to validly incorporate machine learning, by demonstrating the so-called rate double robustness property. The derivations in this paper rely mainly on algebra and some foundational results from statistical inference, which are explained.
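
To make the idea of an estimator based on the efficient influence function more concrete, the following is a minimal sketch and not the paper's own derivation. The abstract does not name a specific estimand, so the sketch assumes the average treatment effect with a binary treatment and a single binary confounder. The function names `aipw_ate` and `simulate`, the simulated data, and the use of simple stratum means in place of machine learning for the two nuisance functions (propensity score and outcome regression) are all illustrative assumptions.

```python
import math
import random


def aipw_ate(w, a, y):
    """One-step (AIPW) estimate of the average treatment effect E[Y^1 - Y^0].

    Assumption for illustration: w is a single binary confounder, a is a
    binary treatment, and y is the outcome, so both nuisance functions can
    be estimated by stratum means instead of a machine-learning algorithm.
    """
    n = len(y)
    g = {}  # propensity score g(w) = P(A = 1 | W = w)
    q = {}  # outcome regression Q(a, w) = E[Y | A = a, W = w]
    for level in sorted(set(w)):
        idx = [i for i in range(n) if w[i] == level]
        g[level] = sum(a[i] for i in idx) / len(idx)
        for trt in (0, 1):
            ys = [y[i] for i in idx if a[i] == trt]
            q[(trt, level)] = sum(ys) / len(ys)

    # Uncentered efficient influence function for the ATE, per observation:
    # (A/g - (1-A)/(1-g)) * (Y - Q(A, W)) + Q(1, W) - Q(0, W)
    terms = []
    for i in range(n):
        q1, q0, gi = q[(1, w[i])], q[(0, w[i])], g[w[i]]
        qa = q1 if a[i] == 1 else q0
        weight = a[i] / gi - (1 - a[i]) / (1 - gi)
        terms.append(weight * (y[i] - qa) + q1 - q0)

    estimate = sum(terms) / n
    # Standard error from the sample variance of the estimated influence function.
    variance = sum((t - estimate) ** 2 for t in terms) / (n - 1)
    return estimate, math.sqrt(variance / n)


def simulate(n, seed=0):
    """Hypothetical data-generating process with a true ATE of 2.0."""
    rng = random.Random(seed)
    w, a, y = [], [], []
    for _ in range(n):
        wi = 1 if rng.random() < 0.5 else 0
        ai = 1 if rng.random() < (0.7 if wi else 0.3) else 0
        yi = 2.0 * ai + 1.5 * wi + rng.gauss(0, 1)
        w.append(wi)
        a.append(ai)
        y.append(yi)
    return w, a, y


w, a, y = simulate(5000)
estimate, se = aipw_ate(w, a, y)
print(f"ATE estimate: {estimate:.3f} (SE {se:.3f})")
```

In this sketch the estimate is the sample mean of the uncentered influence function, and the standard error comes from its sample variance. In a realistic analysis the stratum means would be replaced by flexible learners, typically with sample splitting, which is the setting where the rate double robustness property matters.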
