Introduction: Genomic Prediction (GP) faces significant challenges in balancing model complexity with computational efficiency, particularly for high-dimensional genomic data under limited sample sizes.

Methods: We propose GViT-GP, a Vision Transformer architecture that injects the Genomic Relationship Matrix (GRM) as a biological prior via a dual-pathway cross-attention fusion mechanism, coupled with a Selective Patch Embedding strategy to reduce redundancy and improve data efficiency.

Results: We evaluated GViT-GP on 20 traits across four datasets from three species (soybean, cattle, and chicken). GViT-GP outperformed established linear and non-linear baselines (including GBLUP, LightGBM, and DNNGP), achieving the best accuracy in 16/20 tasks. Ablation studies supported the effectiveness of Selective Patch Embedding and cross-attention fusion, and visualization analyses suggest adaptive attention to informative genomic regions.

Discussion: These results indicate that injecting GRM-informed inductive bias improves robustness and generalization in “p ≫ n” settings. GViT-GP provides a practical, high-performance framework for capturing complex genotype–phenotype relationships in modern digital breeding.
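To make the GRM prior named in the Methods more concrete, the following is a minimal sketch of how such a matrix is commonly computed from marker genotypes. The abstract does not say which GRM definition GViT-GP uses, so this assumes VanRaden's first method, G = ZZᵀ / (2 Σ pⱼ(1 − pⱼ)), with genotypes coded 0/1/2. The function name `genomic_relationship_matrix` is hypothetical, and the code is plain Python for readability rather than speed.

```python
def genomic_relationship_matrix(genotypes):
    """Return an n x n GRM for n individuals (assumed: VanRaden method 1).

    genotypes: list of n rows, each holding m marker values coded as
    0, 1, or 2 copies of the counted allele.
    """
    n = len(genotypes)
    m = len(genotypes[0])

    # Allele frequency p_j of the counted allele at each marker.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]

    # Scaling term 2 * sum_j p_j * (1 - p_j).
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all markers are monomorphic; GRM is undefined")

    # Center each genotype by its expected value 2 * p_j.
    centered = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]

    # G = Z Z^T / scale, computed entry by entry.
    return [
        [sum(a * b for a, b in zip(centered[i], centered[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy data: 3 individuals, 3 markers (illustrative values only).
    toy = [[0, 1, 2], [1, 1, 0], [2, 1, 1]]
    for row in genomic_relationship_matrix(toy):
        print([round(x, 3) for x in row])
```

The result is a symmetric n × n matrix that depends on the number of individuals rather than the number of markers, which is one reason a GRM is a compact summary in “p ≫ n” settings. How GViT-GP feeds this matrix into its cross-attention pathway is not described in the abstract and is not sketched here.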
Li et al. (Sun,) studied this question.