This article analyzes the results of the 2024 ADIA Lab Causal Discovery Challenge, a large-scale competition designed to evaluate whether machine learning methods can infer causal roles from observational data when ground-truth causal structures are known. Participants classified variables into eight causal categories—including confounders, mediators, colliders, and causal antecedents or consequences of treatment and outcome variables—using 47,000 synthetic datasets generated from known causal directed acyclic graphs. The Challenge attracted 1,904 registered participants and 3,343 submitted solutions worldwide. Our results show that supervised learning approaches substantially outperformed traditional constraint-based causal discovery methods in this controlled setting. Top-performing solutions achieved multiclass balanced accuracies as high as 77%, compared with approximately 40% for baseline methods. The most successful approaches combined rich feature engineering, edge-centric representations, ensemble learning, and graph-based architectures inspired by graph neural networks. Performance differences across graph structures revealed that some causal environments are systematically more difficult for discovery algorithms than others. The findings suggest that, when large libraries of labeled dataset–graph pairs are available, causal-role recognition can be effectively approached as a supervised learning problem, with potential applications in factor research, macro-financial forecasting, stress testing, and portfolio construction.
Olivetti et al. studied this question.
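The headline metric in the abstract, multiclass balanced accuracy, is commonly defined as the unweighted mean of per-class recall. The sketch below illustrates that standard definition only; it is not the Challenge's scoring code, and the function name `balanced_accuracy` and the placeholder labels `role_0` to `role_7` are assumptions made for illustration. Under this definition, a predictor that always outputs the same one of eight classes scores 1/8 = 12.5%, which gives a reference point for the reported figures of roughly 40% and 77%.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("y_true must not be empty")

    total = defaultdict(int)    # number of true instances per class
    correct = defaultdict(int)  # number of correctly predicted instances per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical placeholder labels standing in for the eight causal categories.
ROLES = ["role_%d" % i for i in range(8)]

# Five variables per role, and a predictor that always answers "role_0".
y_true = [role for role in ROLES for _ in range(5)]
constant_pred = [ROLES[0]] * len(y_true)
constant_score = balanced_accuracy(y_true, constant_pred)

# An imbalanced two-class case: plain accuracy would be 0.9 here,
# but balanced accuracy penalizes ignoring the minority class.
imbalanced_true = ["a"] * 9 + ["b"]
imbalanced_pred = ["a"] * 10
imbalanced_score = balanced_accuracy(imbalanced_true, imbalanced_pred)

print(constant_score, imbalanced_score)
```

Because every class contributes equally to the average, the metric does not reward a model for favoring whichever causal role happens to be most frequent in the data.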