What question did this study set out to answer?

This study aims to explore how sample size affects the reliability of normality tests and their corresponding critical values.

March 2, 2026 · Open Access

Empirical and Asymptotic Perspectives on Sample Size Adequacy in Normality Testing: A Monte Carlo Study

Key Points

This study aims to explore how sample size affects the reliability of normality tests and their corresponding critical values.
A large-scale Monte Carlo simulation study was conducted for sixteen normality tests.
Empirical and asymptotic critical values were evaluated at sample sizes of 25, 50, 100, and 500.
Empirical power was assessed at significance levels of 0.05 and 0.10.
Results were analyzed across symmetric and asymmetric alternative distributions.
Substantial discrepancies between empirical and asymptotic critical values were found, especially at smaller sample sizes.
Tests showed rapid power gains under symmetric alternatives until reaching saturation, while gains for asymmetric alternatives continued at larger sample sizes.
Raising the significance level from 0.05 to 0.10 increased power uniformly without changing the relative ranking of the tests.
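
To make the gap between empirical and asymptotic critical values concrete, the following is a minimal sketch in Python using only the standard library. It rests on several assumptions that are not taken from the study: the Jarque-Bera statistic stands in for "a normality test" (the summary does not list the sixteen tests), the replication count `reps` is arbitrary, and the function names are hypothetical. The asymptotic benchmark is the chi-square distribution with 2 degrees of freedom, whose upper quantile has the closed form -2 ln(α).

```python
import math
import random


def jarque_bera(sample):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def empirical_critical_values(n, alphas, reps=2000, seed=0):
    """Upper-alpha quantiles of the statistic over simulated normal samples.

    `reps` is an illustrative choice, not the study's replication count.
    """
    rng = random.Random(seed)
    stats = sorted(
        jarque_bera([rng.gauss(0.0, 1.0) for _ in range(n)]) for _ in range(reps)
    )
    result = {}
    for alpha in alphas:
        index = math.ceil((1.0 - alpha) * reps) - 1
        result[alpha] = stats[max(0, min(reps - 1, index))]
    return result


def asymptotic_critical_value(alpha):
    """Chi-square(2) upper-alpha quantile; its survival function is exp(-x / 2)."""
    return -2.0 * math.log(alpha)


if __name__ == "__main__":
    alphas = (0.05, 0.10)
    for n in (25, 50, 100, 500):  # sample sizes named in the summary
        empirical = empirical_critical_values(n, alphas)
        for alpha in alphas:
            print(
                f"n={n:3d} alpha={alpha:.2f} "
                f"empirical={empirical[alpha]:.3f} "
                f"asymptotic={asymptotic_critical_value(alpha):.3f}"
            )
```

Comparing the two columns for each sample size shows how far a test calibrated with the asymptotic value can drift from its nominal level at small n. Other tests would need their own statistic and asymptotic reference distribution.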

Abstract

Normality tests are widely used in statistical practice; however, their finite-sample behavior—shaped by the interaction between sample size, critical value calibration, and distributional structure—remains insufficiently understood. This study investigates how sample size governs the reliability of empirical and asymptotic critical values and, in turn, shapes the empirical power of widely used normality tests under symmetric and asymmetric departures from normality. A large-scale Monte Carlo simulation study was conducted for sixteen widely used normality tests. Empirical and asymptotic critical values were evaluated across sample sizes n=25, 50, 100 and 500, together with the asymptotic benchmark. Empirical power was assessed at significance levels α=0.05 and α=0.10, with results summarized by averaging across structurally similar symmetric and asymmetric alternative distributions. Substantial discrepancies between empirical and asymptotic critical values were observed for several tests at small and moderate sample sizes. These discrepancies translated directly into heterogeneous power behavior. Under symmetric alternatives, many tests exhibited rapid power gains up to moderate sample sizes, followed by clear saturation. In contrast, asymmetric alternatives showed delayed power accumulation, with meaningful gains persisting at larger sample sizes. Increasing the significance level increased power uniformly but did not alter relative test rankings. Sample size effects in normality testing are strongly distribution-dependent and cannot be adequately captured by asymptotic theory alone. Moderate samples may suffice for detecting symmetric deviations, whereas asymmetric departures require larger samples to achieve reliable power. These findings underscore the importance of finite-sample considerations in normality testing and provide a mechanistic basis for more informed test selection.
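
The power comparison described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: the Jarque-Bera statistic is a stand-in test, the Laplace and exponential distributions are hypothetical examples of a symmetric and an asymmetric alternative, the replication count is arbitrary, and the averaging across groups of alternatives is not reproduced. The test is calibrated with empirical critical values simulated under normality, and power is the fraction of alternative samples whose statistic exceeds that value.

```python
import math
import random


def jarque_bera(sample):
    """Jarque-Bera statistic (stand-in test; an assumption, not from the study)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    return n / 6.0 * ((m3 / m2 ** 1.5) ** 2 + (m4 / m2 ** 2 - 3.0) ** 2 / 4.0)


def normal(rng):
    return rng.gauss(0.0, 1.0)


def laplace(rng):
    """Hypothetical symmetric alternative: exponential magnitude, random sign."""
    return rng.expovariate(1.0) * rng.choice((-1.0, 1.0))


def exponential(rng):
    """Hypothetical asymmetric alternative."""
    return rng.expovariate(1.0)


def simulate(draw, n, reps, rng):
    """Test statistics for `reps` samples of size n drawn with `draw`."""
    return [jarque_bera([draw(rng) for _ in range(n)]) for _ in range(reps)]


def empirical_power(draw_alt, n, alphas, reps=1000, seed=0):
    """Rejection rate under the alternative, using empirical critical values."""
    rng = random.Random(seed)
    null_stats = sorted(simulate(normal, n, reps, rng))
    alt_stats = simulate(draw_alt, n, reps, rng)
    power = {}
    for alpha in alphas:
        index = max(0, min(reps - 1, math.ceil((1.0 - alpha) * reps) - 1))
        critical = null_stats[index]
        power[alpha] = sum(s > critical for s in alt_stats) / reps
    return power


if __name__ == "__main__":
    alphas = (0.05, 0.10)
    for name, draw in (("laplace", laplace), ("exponential", exponential)):
        for n in (25, 50, 100, 500):  # sample sizes named in the abstract
            power = empirical_power(draw, n, alphas)
            print(name, n, {a: round(p, 3) for a, p in power.items()})
```

Because both significance levels are evaluated on the same simulated samples, power at α=0.10 can never fall below power at α=0.05 in this sketch, which mirrors the uniform increase reported in the abstract.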

Cite This Study

Mehmet Tahir Huyut studied this question.

synapsesocial.com/papers/69a52dabf1e85e5c73bf0b72 https://doi.org/10.21597/jist.1846196
