What question did this study set out to answer?

The aim is to provide researchers with a practical guide for estimating and discovering heterogeneous treatment effects using machine learning techniques.

June 17, 2026

Estimating and discovering heterogeneous treatment effects using machine learning in epidemiological studies: a practical guide

Key Points

Overview of the core motivations for heterogeneous treatment effect (HTE) analysis
Methodological overview of machine learning algorithms for estimating the conditional average treatment effect (CATE)
Application example using a national sample of US older adults
Statistical code for implementing the models effectively
Critical considerations for analyzing HTE with highly granular CATE
Tools applicable to both randomized controlled trials and observational studies

Abstract

Machine learning-based heterogeneous treatment effect (HTE) estimation and discovery have recently received substantial attention in the healthcare literature. In particular, meta-learner frameworks and causal forests have been widely used to estimate the conditional average treatment effect (CATE). Such advances in HTE estimation and discovery have allowed researchers to assess HTE patterns in their data. Here, we provide a comprehensive and practical guide, together with statistical code, for researchers to implement these models effectively. Specifically, we first review the core motivations for HTE analysis. Then, we give a methodological overview of popular machine learning algorithms for CATE estimation and describe how to calibrate their model fit. After demonstrating an application example using a national sample of US older adults, we discuss some critical and practical considerations of HTE analysis with highly granular CATE. Finally, we discuss the assessment of HTE, including the scale and reference point, as well as the interpretation of CATE. Overall, this paper aims to equip researchers with both the conceptual understanding and practical tools necessary to apply machine learning-based HTE analysis in epidemiological research, including both randomized controlled trials and observational studies.
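
To make the meta-learner idea concrete, the following is a minimal sketch of a T-learner, one common meta-learner: fit a separate outcome model in the treated and control arms, then take the difference of their predictions as the estimated CATE, tau(x) = E[Y(1) - Y(0) | X = x]. This is not the paper's code. It assumes a simulated randomized trial with a single covariate, and it uses simple least-squares lines where an applied analysis would use flexible machine learning models. The function names (fit_line, t_learner_cate) and the data-generating process are hypothetical, chosen only for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def t_learner_cate(xs, ts, ys):
    """T-learner: one outcome model per arm; returns a function x -> CATE estimate."""
    treated = [(x, y) for x, t, y in zip(xs, ts, ys) if t == 1]
    control = [(x, y) for x, t, y in zip(xs, ts, ys) if t == 0]
    a1, b1 = fit_line([x for x, _ in treated], [y for _, y in treated])
    a0, b0 = fit_line([x for x, _ in control], [y for _, y in control])
    # Estimated CATE is the difference between the two fitted outcome models.
    return lambda x: (a1 + b1 * x) - (a0 + b0 * x)


# Simulated randomized trial (assumption): the true CATE is 1 + 2*x.
rng = random.Random(0)
n = 4000
xs = [rng.uniform(-1, 1) for _ in range(n)]
ts = [rng.randint(0, 1) for _ in range(n)]  # treatment assigned by coin flip
ys = [0.5 * x + t * (1 + 2 * x) + rng.gauss(0, 0.5) for x, t in zip(xs, ts)]

cate = t_learner_cate(xs, ts, ys)
# The estimates should be close to the true values 0 and 2, up to sampling noise.
print(round(cate(-0.5), 2), round(cate(0.5), 2))
```

Because treatment is randomized in this simulation, the difference between the two arm-specific models identifies the CATE without further adjustment. In an observational study the outcome models would also need to account for confounders, which this sketch does not attempt.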
