What question did this study set out to answer?

The study aims to evaluate the impact of hyperparameters on prediction stability and performance in wheat breeding using machine learning.

May 25, 2026 | Open Access

Multi-metric Evaluation and Parametric Optimization of Stochastic Gradient Boosting Machines for Genomic Prediction and Selection in Wheat (Triticum aestivum) Breeding

Key Points

Evaluated 36 combinations of gradient boosting machine hyperparameters: learning rate and boosting rounds.
Assessed four agronomic traits using five performance metrics: Pearson's r, AUC, NDCG, ICC, and Fleiss' κ.
Conducted benchmark comparisons with standard rrBLUP methods.
A low learning rate with a high number of boosting rounds achieved ICC > 0.98 for prediction stability.
This configuration also improved predictive accuracy (r) and classification accuracy (AUC) while maintaining NDCG > 0.85 for ranking efficiency.
GBM configurations showed comparable performance to rrBLUP with modest trait-dependent differences.
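
The grid and the prediction-stability measure named in the points above can be sketched as follows. This is a minimal illustration rather than the study's code: the specific learning-rate and boosting-round values are hypothetical (the source gives only the count of 36 combinations), and the one-way random-effects form ICC(1,1) is an assumption, since the source does not say which ICC variant was used.

```python
from itertools import product
from statistics import mean

# Hypothetical grid values: the source states only that 36 combinations
# of learning rate and boosting rounds were evaluated, not which ones.
LEARNING_RATES = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5]
N_TREES = [50, 100, 500, 1000, 5000, 10000]
GRID = list(product(LEARNING_RATES, N_TREES))  # 6 x 6 = 36 settings


def icc_oneway(preds):
    """ICC(1,1) for an n_lines x k_runs table of predictions.

    Rows are breeding lines; columns are repeated runs of one stochastic
    model configuration fitted with different random seeds. The ICC
    variant is an assumption made for this sketch.
    """
    n = len(preds)
    k = len(preds[0])
    grand = mean(v for row in preds for v in row)
    row_means = [mean(row) for row in preds]
    # Between-line and within-line mean squares from a one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (v - m) ** 2 for row, m in zip(preds, row_means) for v in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)
```

In use, each of the 36 settings in `GRID` would be fitted several times with different seeds, and `icc_oneway` applied to the resulting table of predictions; values near 1 mean the repeated runs agree on each line's predicted value.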

Abstract

Machine learning (ML) models with stochastic, non-deterministic behavior are increasingly used for genomic prediction in plant breeding, but their evaluation often neglects prediction stability and ranking performance. This study addresses that gap by evaluating how two hyperparameters of a Gradient Boosting Machine (GBM), learning rate (v) and number of boosting rounds (ntrees), affect stability and multi-metric predictive performance for cross-season, cross-environment prediction in a MAGIC wheat population. Using a grid search of 36 parameter combinations, we evaluated four agronomic traits with five metrics: Pearson's r, Area Under the Curve (AUC), and Normalized Discounted Cumulative Gain (NDCG) for performance, and the Intraclass Correlation Coefficient (ICC) and Fleiss' κ for stability. Our findings show that a low learning rate combined with a high number of boosting rounds substantially improves prediction stability (ICC > 0.98) and selection stability (Fleiss' κ > 0.80), while reducing train-test performance gaps. This combination produced concurrent improvements in predictive accuracy (r), classification accuracy (AUC), and ranking efficiency (NDCG), though optimal settings were trait-dependent. Despite moderate Pearson's r in this challenging cross-season, cross-environment prediction scenario, NDCG remained high (> 0.85), indicating a strong ability to rank top-performing entries. In benchmark comparisons conducted within this stump-based additive GBM setting, selected GBM configurations were broadly comparable to rrBLUP, with modest trait-dependent differences across metrics. Ultimately, prioritizing stability when tuning GBMs yields reproducible cross-environment predictions with improved accuracy and top-end ranking performance.
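
Two of the metrics named in the abstract, NDCG for top-end ranking and Fleiss' κ for selection stability, can be written out as a short sketch. The definitions below are the standard textbook forms, not necessarily the exact variants used in the study: using the observed trait value as the gain in NDCG, the cutoff `k`, and the two-category (selected / not selected) layout for κ are all assumptions made for illustration.

```python
import math


def ndcg_at_k(y_true, y_pred, k):
    """NDCG@k with observed trait values as gains.

    Assumes non-negative observed values with a positive ideal DCG;
    the gain definition is an assumption of this sketch.
    """
    def dcg(order):
        # rank is 0-based, so the discount is log2(rank + 2).
        return sum(
            y_true[i] / math.log2(rank + 2)
            for rank, i in enumerate(order[:k])
        )

    idx = range(len(y_true))
    by_pred = sorted(idx, key=lambda i: y_pred[i], reverse=True)
    by_true = sorted(idx, key=lambda i: y_true[i], reverse=True)
    return dcg(by_pred) / dcg(by_true)


def fleiss_kappa(counts):
    """Fleiss' kappa from an n_lines x n_categories table of counts.

    counts[i][j] is the number of repeated runs that placed line i in
    category j (for example selected / not selected). Every row must
    sum to the same number of runs, and chance agreement must be < 1.
    """
    n_lines = len(counts)
    n_runs = sum(counts[0])
    n_cat = len(counts[0])
    # Observed agreement per line, then averaged over lines.
    p_i = [
        (sum(c * c for c in row) - n_runs) / (n_runs * (n_runs - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_lines
    # Chance agreement from the overall category proportions.
    p_j = [
        sum(row[j] for row in counts) / (n_lines * n_runs)
        for j in range(n_cat)
    ]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)
```

Because NDCG discounts errors lower in the ranking, a model can score well on it whenever the best lines land near the top, even if its predictions across the full population correlate only moderately with the observed values.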
