Genomic selection (GS) has become an increasingly popular tool in plant breeding programs, propelled by declining genotyping costs, an increase in computational power, and rediscovery of the best linear unbiased prediction methodology over the past two decades. This development has led to an accumulation of extensive historical datasets with genotypic and phenotypic information, raising the question of how to best utilize these datasets. Here, we investigate whether all available data or a subset should be used to calibrate GS models for across-year predictions in a 7-year dataset of a commercial hybrid sunflower breeding program. We employed a multi-objective optimization approach to determine the ideal years to include in the training set (TRS). Next, for a given combination of TRS years, we further optimized the TRS size and its genetic composition. We developed the MinGRM size optimization method, which consistently found the optimal TRS size, reducing dimensionality by 20% with an approximately 1% loss in predictive ability. Additionally, the TailsGEGVs algorithm displayed potential, outperforming the use of all data by using just 60% of it for grain yield, a high-complexity, low-heritability trait. Moreover, maximizing the genetic diversity of the TRS resulted in a consistent predictive ability across the entire range of genotypic values in the test set. Interestingly, the TailsGEGVs algorithm, due to its ability to leverage heterogeneity, enhanced predictive performance for key hybrids with extreme genotypic values. Our study provides new insights into the optimal utilization of historical data in plant breeding programs, resulting in improved GS model predictive ability.
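The abstract does not describe how TailsGEGVs works internally, so the following is only an illustrative sketch of the general idea its name suggests: keeping the training candidates whose genomic estimated genetic values (GEGVs) fall in the two tails of the distribution. The function name `select_tails`, the even split between the lower and upper tail, and the toy values are all assumptions made for illustration, not the authors' algorithm.

```python
def select_tails(gegvs, fraction):
    """Return ids of candidates whose values lie in the two tails.

    gegvs:    dict mapping a candidate id to its estimated genetic value
              (hypothetical input format, assumed for this sketch).
    fraction: share of candidates to keep, e.g. 0.6 for 60%.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Rank candidates from the lowest to the highest estimated value.
    ranked = sorted(gegvs, key=gegvs.get)

    # Assumption: split the kept candidates evenly between both tails,
    # giving the upper tail the extra one when the count is odd.
    n_keep = round(len(ranked) * fraction)
    n_low = n_keep // 2
    n_high = n_keep - n_low

    low = ranked[:n_low]
    high = ranked[len(ranked) - n_high:] if n_high else []
    return low + high


# Toy values, invented purely for illustration.
toy_gegvs = {
    "h1": -2.0, "h2": -1.0, "h3": -0.1, "h4": 0.0, "h5": 0.2,
    "h6": 0.9, "h7": 1.5, "h8": 2.1, "h9": 0.1, "h10": -0.5,
}
training_ids = select_tails(toy_gegvs, 0.6)
print(training_ids)
```

With ten toy candidates and a fraction of 0.6, the sketch keeps the three lowest and three highest values and drops the four in the middle, which mirrors the "60% of the data" setting mentioned in the abstract without claiming to reproduce the published results.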
Javier Fernández-González
Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria
Bertrand Haquin
AGroecologies, Innovations & Ruralities
Eliette Combes
Limagrain (France)
Plant Methods
Universidad Politécnica de Madrid
Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria
Centre for Plant Biotechnology and Genomics
Fernández-González et al. studied this question.
synapsesocial.com/papers/68e73b96b6db6435876b50e8 — DOI: https://doi.org/10.1186/s13007-024-01151-0