April 6, 2026 | Open Access

pyFLANK, a graph neural network based null distribution inference model for FST outlier detection

Abstract

Background: Detecting genomic regions under selection is essential for understanding how populations adapt to different environments, yet it remains challenging due to the confounding effects of demographic history and linkage disequilibrium (LD). The fixation index (FST) is a widely used statistic to identify genomic regions under adaptation. However, identifying genes under selection by defining FST outliers often remains challenging, owing to confounding effects of underlying demographic history. Traditional methods assume independence among loci and rely on simple demographic models, while newer models perform much better but are computationally expensive and not easily scalable.

Results: Here, we present pyFLANK, an open-source and automated Python implementation which detects FST outliers using a null distribution inferred from quasi-independent loci. Our tool integrates three approaches to identify loci obeying a null distribution: graph neural network (GNN) inference, LD-based inference, and user-defined input. Because pyFLANK uses GNN-based inference of quasi-independent loci, it yields a more accurate null model with less need for user parameter input. In simulation experiments, pyFLANK achieved lower false positive rates than current methods while maintaining comparable detection power, indicating that its refined null model better distinguishes true adaptive loci from background variation. The GNN-based model, in particular, detected additional loci associated with phenotypic variance that were not identified by existing methods.

Conclusions: Assessments of simulation and real data from different species demonstrate that pyFLANK achieves lower false positive rates compared with other commonly used FST outlier detectors, while maintaining comparable detection power and excellent computational performance, providing a robust and user-friendly tool for identifying loci under divergent selection. It extends existing FST outlier frameworks by incorporating explicit LD-aware strategies for null model calibration. The method is intended as a practical and scalable complement to existing genome scan approaches.
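The general idea of calling FST outliers against a null built from a chosen subset of loci can be illustrated with a minimal sketch. This is not pyFLANK's API or its null model, which the abstract does not specify. The function names (`fst_two_pops`, `empirical_p_values`, `flag_outliers`) are hypothetical, the FST estimator is the simple heterozygosity-based form for two equally weighted populations, and a plain empirical p-value stands in for whatever fitted null distribution the tool actually uses. The set of "null" loci is taken as given here; in pyFLANK it would come from GNN inference, LD-based inference, or user input.

```python
from bisect import bisect_left


def fst_two_pops(p1, p2):
    """FST = (H_T - H_S) / H_T for one biallelic locus, given the allele
    frequency in each of two equally weighted populations (assumption)."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0                          # monomorphic locus: no differentiation
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t


def empirical_p_values(fst, null_idx):
    """Upper-tail empirical p-value of every locus against the FST values
    of the loci listed in null_idx (the putatively neutral set)."""
    null = sorted(fst[i] for i in null_idx)
    n = len(null)
    pvals = []
    for value in fst:
        n_ge = n - bisect_left(null, value)  # null values >= observed value
        pvals.append((n_ge + 1) / (n + 1))   # +1 keeps p strictly above zero
    return pvals


def flag_outliers(fst, null_idx, alpha=0.05):
    """Indices of loci whose empirical p-value is at most alpha."""
    pvals = empirical_p_values(fst, null_idx)
    return [i for i, p in enumerate(pvals) if p <= alpha]


if __name__ == "__main__":
    # Toy data: 19 weakly differentiated loci plus one strongly differentiated locus.
    fst = [i / 1000 for i in range(1, 20)] + [fst_two_pops(0.05, 0.95)]
    print(flag_outliers(fst, null_idx=range(19), alpha=0.06))
```

In this sketch the smallest attainable p-value is 1 / (n_null + 1), so the size of the null set bounds how stringent the threshold can be. A parametric null fitted to the same loci would avoid that limit, at the cost of a distributional assumption.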

Cite This Study

Zhang et al. (2026) studied this question.

synapsesocial.com/papers/6a176bc0aeefdf6d9c128324 https://doi.org/10.1186/s12859-026-06430-2
