Normality tests are a staple of statistical practice, yet their finite-sample behavior, which is shaped by the interaction between sample size, critical value calibration, and distributional structure, remains insufficiently understood. This study investigates how sample size governs the reliability of empirical and asymptotic critical values and, in turn, shapes the empirical power of common normality tests under symmetric and asymmetric departures from normality. A large-scale Monte Carlo simulation study was conducted for sixteen normality tests. Empirical and asymptotic critical values were evaluated at sample sizes n = 25, 50, 100, and 500, together with the asymptotic benchmark. Empirical power was assessed at significance levels α = 0.05 and α = 0.10, with results summarized by averaging across structurally similar symmetric and asymmetric alternative distributions. Substantial discrepancies between empirical and asymptotic critical values were observed for several tests at small and moderate sample sizes, and these discrepancies translated directly into heterogeneous power behavior. Under symmetric alternatives, many tests exhibited rapid power gains up to moderate sample sizes, followed by clear saturation. In contrast, power under asymmetric alternatives accumulated more slowly, with meaningful gains persisting at larger sample sizes. Raising the significance level from 0.05 to 0.10 increased power uniformly but did not alter the relative ranking of the tests. Sample size effects in normality testing are therefore strongly distribution-dependent and cannot be adequately captured by asymptotic theory alone. Moderate samples may suffice for detecting symmetric deviations, whereas asymmetric departures require larger samples to achieve reliable power. These findings underscore the importance of finite-sample considerations in normality testing and provide a mechanistic basis for more informed test selection.
Mehmet Tahir Huyut studied this question.
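The simulation design described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation. The abstract does not name the sixteen tests or the alternative distributions, so the Jarque-Bera statistic, the Laplace distribution (symmetric alternative), and the exponential distribution (asymmetric alternative) are stand-ins chosen here because they need only the standard library. The function names, seed, and replication count are likewise hypothetical.

```python
import math
import random


def jarque_bera(x):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4).

    Stand-in test statistic (an assumption; the source does not list its tests).
    """
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def asymptotic_critical_value(alpha):
    """Upper-alpha quantile of chi-square with 2 df, the asymptotic law of JB.

    The chi-square(2) survival function is exp(-x / 2), so the quantile
    has the closed form -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)


def empirical_critical_value(n, alpha, reps, rng):
    """(1 - alpha) sample quantile of the statistic over simulated N(0, 1) samples."""
    stats = sorted(
        jarque_bera([rng.gauss(0.0, 1.0) for _ in range(n)]) for _ in range(reps)
    )
    idx = min(reps - 1, math.ceil((1.0 - alpha) * reps) - 1)
    return stats[idx]


def empirical_power(sampler, n, crit, reps, rng):
    """Share of simulated samples from `sampler` whose statistic exceeds `crit`."""
    rejections = sum(
        jarque_bera([sampler(rng) for _ in range(n)]) > crit for _ in range(reps)
    )
    return rejections / reps


def laplace(rng):
    # Symmetric alternative (assumed): the difference of two independent
    # Exp(1) draws follows a standard Laplace distribution.
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def exponential(rng):
    # Asymmetric alternative (assumed): Exp(1), which is right-skewed.
    return rng.expovariate(1.0)


if __name__ == "__main__":
    rng = random.Random(2024)  # arbitrary fixed seed for reproducibility
    alpha, reps = 0.05, 500    # small replication count keeps the demo fast
    asym = asymptotic_critical_value(alpha)
    for n in (25, 50, 100, 500):  # sample sizes named in the abstract
        emp = empirical_critical_value(n, alpha, reps, rng)
        p_sym = empirical_power(laplace, n, emp, reps, rng)
        p_asym = empirical_power(exponential, n, emp, reps, rng)
        print(
            f"n={n:4d}  empirical cv={emp:6.3f}  asymptotic cv={asym:6.3f}  "
            f"power(Laplace)={p_sym:.3f}  power(Exp)={p_asym:.3f}"
        )
```

Running the script prints, for each sample size, the simulated and asymptotic critical values side by side along with the estimated power against each alternative. A study of the kind described would use far more replications, many more tests, and several alternatives per class; passing the asymptotic value as `crit` instead of the empirical one shows how miscalibrated critical values feed through to rejection rates.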