Measurement invariance (MI) is a prerequisite for the meaningful and valid comparison of test scores across individuals with different group membership. Given that tests are often used in high-stakes contexts (e.g., diagnosis), the practical impact of violations of MI is of great interest to researchers and practitioners alike. Existing approaches to evaluating the practical impact of noninvariance on selection or classification accuracy have mostly considered MI across two groups. When a population is made up of multiple subpopulations (e.g., ethnic groups), groups are often dichotomized for ease of analysis, which may lead to misleading inferences due to the loss of information and precision. The current paper introduces a general framework for investigating the practical impact of measurement noninvariance on the accuracy and fairness of decisions made using a test administered to individuals from any number of subpopulations. We demonstrate the application and the advantages of the multi-group multidimensional classification accuracy analysis (MMCAA) framework through an illustrative example on the MI of a depression scale across four ethnic groups using a national dataset, showing that valuable information is lost if the grouping variable is collapsed. We offer guidelines for interpretation. The MMCAA framework is fully automated in the R package unbiasr.
Özcan et al. studied this question.