Genomic selection of soybean (Glycine max) for genetic improvement of yield and seed composition in a breeding context

Abstract

Genomic selection has been used for genetic improvement in both plant and animal breeding and is well suited to improving quantitative traits. In this study, genomic selection was evaluated using novel validation methods together with plant materials and data from a commercial soybean (Glycine max) breeding program. A total of 1501 inbred lines were used to test multiple genomic selection models for multiple traits. Validation included cross-validation, inter-environment validation, and empirical validation. The extended genomic best linear unbiased prediction (EGBLUP) model was the most effective model tested for yield, protein, and oil in cross-validation, with accuracies of 0.50, 0.68, and 0.64, respectively. Increasing the number of single nucleotide polymorphism markers from 1000 to 3000 to 6000 led to statistically significant increases in accuracy. Inter-environment prediction accuracies were significantly lower than cross-validation accuracies, at 0.24, 0.54, and 0.42 for yield, protein, and oil, respectively, using the EGBLUP model. Empirical validation, predicting the yield of 510 soybean lines, had a prediction accuracy of 0.34, and including a maturity covariate notably increased accuracy. Genomic selection identified high-performing lines in inter-environment predictions: 34% of the lines in the upper quartile for yield, and 51% and 48% of the lines in the upper quartile for protein and oil, respectively. Statistically similar results were obtained when comparing rankings from empirical validation with selections for advancement in yield trials. These results indicate that genomic selection is a useful tool for selection decisions.
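
To make the two reported metrics concrete, the following is a minimal sketch of how a cross-validated prediction accuracy (the correlation between predicted and observed values) and an upper-quartile recovery rate could be computed. It uses plain additive ridge regression on marker genotypes, not the EGBLUP model evaluated in the study, and it runs on simulated data. The function names, the shrinkage parameter `lam`, the fold count, and the simulated population sizes are all illustrative assumptions and are not taken from the study.

```python
import numpy as np


def ridge_blup_predict(M_train, y_train, M_test, lam):
    """Predict phenotypes from markers with ridge regression (illustrative only)."""
    mu = y_train.mean()
    col_means = M_train.mean(axis=0)
    Z_train = M_train - col_means  # center markers on the training set
    Z_test = M_test - col_means
    # Dual (n x n) form of ridge regression: solve (ZZ' + lam*I) a = y - mu
    K = Z_train @ Z_train.T
    a = np.linalg.solve(K + lam * np.eye(K.shape[0]), y_train - mu)
    return mu + Z_test @ (Z_train.T @ a)


def cross_validate(M, y, lam, k=5, seed=0):
    """k-fold cross-validation; accuracy is the predicted-vs-observed correlation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    preds = np.empty(len(y), dtype=float)
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)  # every line not in the held-out fold
        preds[fold] = ridge_blup_predict(M[train], y[train], M[fold], lam)
    return np.corrcoef(preds, y)[0, 1], preds


def top_quartile_recovery(preds, y):
    """Share of observed upper-quartile lines also ranked in the predicted upper quartile."""
    n_top = max(1, len(y) // 4)
    top_obs = set(np.argsort(y)[-n_top:].tolist())
    top_pred = set(np.argsort(preds)[-n_top:].tolist())
    return len(top_obs & top_pred) / n_top


# Simulated inbred lines (genotypes coded 0 or 2); all values are hypothetical.
rng = np.random.default_rng(42)
n_lines, n_markers = 300, 100
M = 2.0 * rng.integers(0, 2, size=(n_lines, n_markers))
g = M @ rng.normal(0.0, 1.0, n_markers)        # simulated genetic values
y = g + rng.normal(0.0, g.std(), n_lines)      # residual noise, heritability near 0.5

accuracy, preds = cross_validate(M, y, lam=float(n_markers))
recovery = top_quartile_recovery(preds, y)
print(f"cross-validation accuracy: {accuracy:.2f}")
print(f"upper-quartile recovery: {recovery:.0%}")
```

In this sketch `lam` plays the role of the ratio of residual variance to marker-effect variance; in practice that ratio would be estimated from the data rather than fixed in advance.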