The out-of-distribution (OOD) property of data is regarded as one of the main challenges hindering the generalization ability of machine learning algorithms. However, the underlying reasons for this property remain an open question. In this paper, we seek to improve our understanding of the OOD phenomenon by framing it as a problem of distribution shift and addressing it from two complementary causal perspectives. The first is a generative causal view that elucidates the data generation process: we introduce a novel three-dimensional coordinate system to represent three fundamental distribution shifts and illustrate their role in various OOD generalization problems. The second is an anti-causal view that focuses on the model learning process: we develop an effective approach dubbed Counterfactual Risk Minimization (CRM) to address arbitrary distribution shifts in a unified framework. Additionally, we introduce a new multi-domain visual recognition dataset called CONA to facilitate further exploration of OOD generalization. We evaluate CRM alongside several state-of-the-art competitors on four benchmark datasets under the three distribution shifts. The results not only affirm CRM's superiority but also shed light on potential future directions.
Yang et al. studied this question.