Multilevel modeling is widely used to analyze hierarchically structured data in fields such as education and psychology. Despite the growing use of multilevel models, guidance on calculating and interpreting effect size measures in these models remains limited, particularly for models with random slopes and for models that incorporate sampling weights. This dissertation addresses these gaps through a review of existing practices followed by two methodological studies. Chapter 2 provides a brief review of effect size reporting practices in applied multilevel research, based on a survey of four academic journals in education and psychology. It also offers an account of commonly used and recommended effect size measures for multilevel modeling, which this dissertation develops further for models with random slopes, an area that remains underdeveloped. The first study, presented in Chapter 3, introduces the predictor standardization approach as a practical alternative to Johnson’s mixture distribution approach for estimating variance components in multilevel models with random slopes. Using mathematical derivations and real data demonstrations, this study illustrates how centering and scaling (the two steps of standardizing) influence variance decomposition in multilevel modeling. More importantly, the study shows how standardizing a predictor makes the computation of effect size measures, such as R² and the standardized mean difference, much more straightforward. Simulation results further show that both the predictor standardization approach and Johnson’s approach yield effect sizes close to the true values across the simulated conditions. The second study extends effect size measures for multilevel models with random slopes to incorporate sampling weights for population-level inference. Both Johnson’s approach and the predictor standardization approach are extended to incorporate sampling weights, and their implementation is described in detail. An empirical example illustrates the utility of the two methods. Together, these studies provide methodologically grounded, user-friendly guidance on effect size measures in multilevel modeling, with specific attention to random slope models and population-level inference with sampling weights. The dissertation concludes with practical recommendations and directions for future research to promote effect size reporting in multilevel contexts.
Guanyu Chen studied this question.