Abstract In genetic association studies, permutation tests are a cornerstone for estimating p-values, because researchers may design new test statistics that have no known closed-form distribution, or the assumptions of a well-established test may not hold. However, the number of permutations required is inversely proportional to the p-value being estimated: the smaller the p-value, the more permutations are needed. In genome-wide association studies, where multiple-testing corrections are routinely applied, the p-values of interest are extremely small and demand a daunting number of permutations that may exceed the available computational resources. Existing methods that reduce the required number of permutations all assume a specific form of the test statistic in order to exploit its statistical properties. We propose Kernel-smoothed permutation, a model-free method that is applicable to any statistic. Our method estimates the null distribution of the test statistic using a kurtosis-driven transformation followed by kernel density estimation (KDE). We compared Kernel-smoothed permutation to Naïve permutation using statistics with known closed-form null distributions. For three test statistics frequently used in association studies, i.e., the t-test, the sequence kernel association test (SKAT), and the chi-squared test, we demonstrated that our method reduced the required number of permutations by an order of magnitude while achieving similar or higher accuracy. In a real-world genome-wide association study (GWAS) analysis of a Crohn’s disease cohort, we further confirmed that our method substantially outperforms Naïve permutation.
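The contrast between the two estimators can be illustrated with a minimal sketch. It is not the authors' implementation: the kurtosis-driven transformation is not specified in this abstract and is omitted, the bandwidth uses a generic Silverman-style rule of thumb, and the statistic (`mean_difference`), the simulated data, and all function names are hypothetical choices made only for illustration. The naïve estimate counts permuted statistics at least as large as the observed one, so it can never fall below 1/(n_perm + 1); the smoothed estimate integrates the upper tail of a Gaussian KDE fitted to the same permuted statistics.

```python
import math
import random
import statistics


def mean_difference(values, labels):
    """Hypothetical test statistic: mean of cases minus mean of controls."""
    cases = [v for v, g in zip(values, labels) if g == 1]
    controls = [v for v, g in zip(values, labels) if g == 0]
    return statistics.fmean(cases) - statistics.fmean(controls)


def permuted_statistics(values, labels, statistic, n_perm, rng):
    """Recompute the statistic under n_perm random relabelings."""
    shuffled = list(labels)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_stats.append(statistic(values, shuffled))
    return null_stats


def naive_p_value(null_stats, observed):
    """Empirical one-sided p-value with the usual +1 correction."""
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)


def kde_p_value(null_stats, observed):
    """One-sided upper-tail area of a Gaussian KDE fitted to the permuted statistics."""
    n = len(null_stats)
    # Assumption: Silverman-style rule-of-thumb bandwidth, not the paper's choice.
    h = 1.06 * statistics.stdev(null_stats) * n ** (-0.2)
    # Upper tail of one Gaussian kernel centred at s with standard deviation h.
    tail = sum(0.5 * math.erfc((observed - s) / (h * math.sqrt(2))) for s in null_stats)
    return tail / n


# Simulated example data (assumption: 50 cases shifted by 0.8, 50 controls).
rng = random.Random(0)
labels = [1] * 50 + [0] * 50
values = [rng.gauss(0.8 if g == 1 else 0.0, 1.0) for g in labels]

observed = mean_difference(values, labels)
null_stats = permuted_statistics(values, labels, mean_difference, 500, rng)
print("naive p-value:", naive_p_value(null_stats, observed))
print("KDE-smoothed p-value:", kde_p_value(null_stats, observed))
```

With 500 permutations the naïve estimate is bounded below by 1/501, whereas the smoothed estimate can report smaller values because it extrapolates the fitted density into the tail. A plain Gaussian KDE inherits Gaussian tails, so the accuracy of that extrapolation depends on the shape of the true null distribution.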
Bian et al. (Mon,) studied this question.