Chromosome‐scale assembly and analysis of Melilotus officinalis genome for SSR development and nodulation genes analysis

Abstract

Melilotus officinalis is an important legume crop with forage and Chinese medicinal value. The lack of a reference genome for M. officinalis has restricted the domestication and utilization of the species and its germplasm resource diversity. A chromosome-scale assembly of the M. officinalis genome was generated and analysed. The 976.27 Mb genome was divided into eight chromosomes covering 99.16% of the whole genome. A total of 50022 genes were predicted in the genome. M. officinalis and Melilotus albus shared a common ancestor 0.5–5.65 million years ago (MYA). A whole-genome duplication event occurred 68.93 MYA according to the synonymous nucleotide-substitution values. A total of 552102 tandem repeats (TRs) were predicted, and 46004 SSR primers were designed for TRs of 10 or more base pairs. The elucidation of the M. officinalis genome provides a compelling model system for studying the genetics, evolution, and biosynthesis of this legume.
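The abstract does not say which tool predicted the tandem repeats, so the following is only a minimal sketch of how perfect SSRs of 10 or more base pairs could be located in a sequence. The regular-expression approach, the function name find_ssrs, and the motif-length cap of 6 bp are assumptions made for illustration; only the 10-bp minimum comes from the text.

```python
import re

def find_ssrs(seq, min_total_len=10, max_motif_len=6):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    Hypothetical helper: thresholds other than the 10-bp minimum are assumptions.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif_len + 1):
        # Smallest copy number whose total length reaches min_total_len (at least 2).
        min_copies = max(2, -(-min_total_len // k))
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. ATAT = AT x 2),
            # because the shorter unit already reports the same locus.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

example = "GGC" + "AT" * 6 + "CCT" + "AAG" * 4 + "T"
print(find_ssrs(example))  # [(3, 'AT', 6), (18, 'AAG', 4)]
```

Primer design for the flanking regions would be a separate step and is not sketched here.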

All families of transposable elements were active in the recent wheat genome evolution and polyploidy had no impact on their activity

Abstract

Bread wheat (Triticum aestivum L.) is a major crop and its genome is one of the largest ever assembled at reference-quality level. It is 15 Gb in size, hexaploid, and composed of 85% transposable elements (TEs). Studies of wheat genetic diversity have mainly focused on genes, and little is known about the extent of genomic variability affecting TEs, transposition rate, and the impact of polyploidy. Multiple chromosome-scale assemblies are now available for bread wheat and for its tetraploid and diploid wild relatives. In this study, we computed base pair-resolved, gene-anchored, whole genome alignments of A, B, and D lineages at different ploidy levels in order to estimate the variability that affects the TE space. We used assembled genomes of 13 T. aestivum cultivars (6x = AABBDD) and a single genome for Triticum durum (4x = AABB), Triticum dicoccoides (4x = AABB), Triticum urartu (2x = AA), and Aegilops tauschii (2x = DD). We show that 5%–34% of the TE fraction is variable, depending on the species divergence. Between 400 and 13,000 novel TE insertions per subgenome were detected. We found lineage-specific insertions for nearly all TE families in di-, tetra-, and hexaploids. No burst of transposition was observed and polyploidization did not trigger any boost of transposition. This study challenges the prevailing idea of wheat TE dynamics and is more in agreement with an equilibrium model of evolution.

Use of genomic prediction to screen sorghum B‐lines in hybrid testcrosses

Abstract

Use of trifluoromethanesulfonamide (TFMSA), a male gametocide, increases the opportunities to identify promising B-lines because large quantities of F1 seed can be generated prior to the laborious task of B-line sterilization. Combining TFMSA technology with genomic selection could efficiently evaluate sorghum B-lines in hybrid combination to maximize the rates of genetic gain of the crop. This study used two recombinant inbred B-line populations, consisting of 217 lines, which were testcrossed to two R-lines to produce 434 hybrids. Each population of testcross hybrids was evaluated across five environments. Population-based genomic prediction models were assessed across environments using three different cross-validation (CV) schemes, each with 70% training and 30% validation sets. The validation schemes were as follows: CV1—hybrids chosen randomly for validation; CV2—B-lines were randomly chosen, and each chosen B-line had one of the two corresponding testcross hybrids randomly chosen for the validation; and CV3—B-lines were randomly chosen, and each chosen B-line had both corresponding testcross hybrids chosen for the validation. CV1 and CV2 presented the highest prediction accuracies; nonetheless, the prediction accuracies of the CV schemes were not statistically different in many environments. We determined that combining the B-line populations could improve prediction accuracies, and the genomic prediction models were able to effectively rank the poorest 70% of hybrids even when genomic prediction accuracies themselves were low. Results indicate that combining genomic prediction models and TFMSA technology can effectively aid breeders in predicting B-line hybrid performance in early generations prior to the laborious task of generating A/B-line pairs.
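The three validation schemes differ only in how hybrids are assigned to the validation set, which a short sketch can make concrete. The abstract does not state how the 30% was counted under CV2 and CV3, so the code below assumes the validation set always holds about 30% of the hybrids; the line and tester labels and the function names are hypothetical.

```python
import random

def make_hybrids(n_b_lines=217, testers=("R1", "R2")):
    """Each B-line is crossed to both testers: 217 x 2 = 434 hybrids."""
    return [(f"B{i:03d}", t) for i in range(n_b_lines) for t in testers]

def split_hybrids(hybrids, scheme, val_frac=0.30, seed=0):
    """Return (train, validation) hybrid lists for CV1, CV2, or CV3."""
    rng = random.Random(seed)
    n_val = round(val_frac * len(hybrids))  # assumption: 30% of hybrids, not of B-lines
    b_lines = sorted({b for b, _ in hybrids})
    if scheme == "CV1":
        # Hybrids drawn at random, ignoring which B-line they come from.
        val = set(rng.sample(hybrids, n_val))
    elif scheme == "CV2":
        # One of the two testcross hybrids per chosen B-line is held out,
        # so the sibling hybrid of every validation B-line stays in training.
        by_line = {}
        for b, t in hybrids:
            by_line.setdefault(b, []).append((b, t))
        val = {rng.choice(by_line[b]) for b in rng.sample(b_lines, n_val)}
    elif scheme == "CV3":
        # Both hybrids of each chosen B-line are held out, so validation
        # B-lines are never seen during training.
        chosen = set(rng.sample(b_lines, n_val // 2))
        val = {h for h in hybrids if h[0] in chosen}
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    train = [h for h in hybrids if h not in val]
    return train, sorted(val)

hybrids = make_hybrids()
for scheme in ("CV1", "CV2", "CV3"):
    train, val = split_hybrids(hybrids, scheme)
    print(scheme, len(train), len(val))
```

Under these assumptions CV3 is the hardest case for a model, because no phenotype from a validation B-line is available in training.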

Core Ideas

Genomic prediction can be used to screen sorghum B-lines for hybrid grain yield and days to mid-anthesis. Using genomic prediction and the chemical gametocide TFMSA can increase the rate of genetic gain in sorghum B-lines. Using testers to screen sorghum B-line populations is an effective method for screening with genomic prediction. Genomic prediction can effectively predict hybrid performance within and across populations of sorghum B-lines. The ability to accurately rank hybrid performance remained relatively consistent regardless of prediction accuracy.

Salinity stress tolerance prediction for biomass‐related traits in maize (Zea mays L.) using genome‐wide markers

Abstract

Maize (Zea mays L.) is the third most important cereal crop after rice (Oryza sativa) and wheat (Triticum aestivum). Salinity stress significantly affects vegetative biomass and grain yield and, therefore, reduces the food and silage productivity of maize. Selecting salt-tolerant genotypes is a cumbersome and time-consuming process that requires meticulous phenotyping. To predict salt tolerance in maize, we estimated breeding values for four biomass-related traits (shoot length, shoot weight, root length, and root weight) under salt-stressed and control conditions. A five-fold cross-validation method was used to select the best model among genomic best linear unbiased prediction (GBLUP), ridge-regression BLUP (rrBLUP), extended GBLUP, Bayesian Lasso, Bayesian ridge regression, BayesA, BayesB, and BayesC. Examination of the effect of different marker densities on prediction accuracy revealed that a set of low-density single nucleotide polymorphisms obtained through filtering based on a combination of analysis of variance and linkage disequilibrium provided the best prediction accuracy for all the traits. The average prediction accuracy in cross-validations ranged from 0.46 to 0.77 across the four derived traits. The GBLUP, rrBLUP, and all Bayesian models except BayesB demonstrated comparable levels of prediction accuracy that were superior to the other modeling approaches. These findings provide a roadmap for the deployment and optimization of genomic selection in breeding for salt tolerance in maize.
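As a rough illustration of five-fold cross-validation for a ridge-regression marker model, the sketch below estimates marker effects on four folds and reports the mean Pearson correlation between predicted and observed phenotypes on the held-out fold. It assumes NumPy is available, uses simulated genotypes and phenotypes, and fixes the shrinkage parameter lam by hand; in an rrBLUP analysis that parameter would normally be estimated from variance components. None of the names or settings come from the study.

```python
import numpy as np

def ridge_marker_effects(Z, y, lam):
    """Ridge estimates of marker effects, solved in the n x n dual form."""
    mu = y.mean()
    alpha = np.linalg.solve(Z @ Z.T + lam * np.eye(Z.shape[0]), y - mu)
    return mu, Z.T @ alpha

def five_fold_accuracy(Z, y, lam=1.0, k=5, seed=0):
    """Mean correlation between predictions and observations across k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        mu, u = ridge_marker_effects(Z[train_idx], y[train_idx], lam)
        predicted = mu + Z[test_idx] @ u
        accuracies.append(np.corrcoef(predicted, y[test_idx])[0, 1])
    return float(np.mean(accuracies)), folds

# Simulated data (hypothetical): 100 lines, 300 markers coded -1/0/1, 20 causal markers.
rng = np.random.default_rng(1)
Z = rng.integers(0, 3, size=(100, 300)).astype(float) - 1.0
effects = np.zeros(300)
effects[:20] = rng.normal(size=20)
y = Z @ effects + rng.normal(size=100)

accuracy, _ = five_fold_accuracy(Z, y, lam=50.0)
print(f"mean five-fold prediction accuracy: {accuracy:.2f}")
```

Comparing models as in the abstract would amount to swapping ridge_marker_effects for another estimator while keeping the same folds.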

Whole‐genome versus per‐chromosome targeted recombination: Simulations and predicted gains in maize with an integer programming model

Abstract

Per-chromosome targeted recombination, with one to two recombinations at specific marker intervals on each chromosome, doubles the predicted genetic gains in biparental populations. We developed an integer programming model to identify where a fixed number of targeted recombinations should occur across the whole genome, without restrictions on the number of targeted recombinations on each chromosome. We compared whole-genome and per-chromosome targeted recombination in 392 biparental maize (Zea mays L.) populations and in simulation experiments. For yield, moisture, test weight, and a simulated trait controlled by 2000 quantitative trait loci (QTL), predicted gains were 8%–9% larger with 10 targeted recombinations across the entire genome than with one targeted recombination on each of the 10 chromosomes. With whole-genome targeted recombination, the number of recombinations on a given chromosome was correlated (r = 0.76–0.91) with the chromosome size (in cM). Simulation results suggested that previous results on gains from targeted recombination relative to nontargeted recombination were too optimistic by around 20%. Because the underlying QTL are unknown, studies on targeted recombination have relied on genomewide marker effects as proxies for QTL information. The simulation results indicated a 25% (for 10 recombinations) to 33% (for 20 recombinations) reduction in response due to the use of genomewide marker effects as proxies for QTL information. Overall, the results indicated that the integer programming model we developed is useful for increasing both the predicted and true gains from targeted recombination, but the predicted gains are likely to overestimate the true gains.
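The difference between a per-chromosome budget and a whole-genome budget can be seen on a toy example. The sketch below is not the authors' integer programming model: it swaps in a small dynamic program over a simplified problem in which each marker contributes a known value depending on which parent's allele is carried, and a gamete may switch parents at most a given number of times. All names and values are hypothetical.

```python
NEG = float("-inf")

def best_by_switches(a_vals, b_vals, max_k):
    """best[k] = highest chromosome value using at most k recombinations.

    a_vals[j] / b_vals[j] are the (hypothetical) values of marker j when the
    allele comes from parent A or parent B.
    """
    # dp[k][p]: best value so far, ending on parent p, with exactly k switches.
    dp = [[NEG, NEG] for _ in range(max_k + 1)]
    dp[0] = [a_vals[0], b_vals[0]]
    for a, b in zip(a_vals[1:], b_vals[1:]):
        new = [[NEG, NEG] for _ in range(max_k + 1)]
        for k in range(max_k + 1):
            for p, gain in ((0, a), (1, b)):
                stay = dp[k][p]
                switch = dp[k - 1][1 - p] if k > 0 else NEG
                new[k][p] = max(stay, switch) + gain
        dp = new
    best, running = [], NEG
    for k in range(max_k + 1):
        running = max(running, max(dp[k]))
        best.append(running)
    return best

def per_chromosome_best(chromosomes, per_chrom=1):
    """Same recombination budget on every chromosome."""
    return sum(best_by_switches(a, b, per_chrom)[per_chrom] for a, b in chromosomes)

def whole_genome_best(chromosomes, total):
    """One budget shared freely across chromosomes (knapsack-style allocation)."""
    alloc = [0.0] * (total + 1)
    for a, b in chromosomes:
        best = best_by_switches(a, b, total)
        alloc = [max(alloc[r - k] + best[k] for k in range(r + 1))
                 for r in range(total + 1)]
    return alloc[total]

chromosomes = [
    ([1, 1, 1, 1], [0, 0, 0, 0]),      # parent A is better everywhere
    ([2, -1, -1, 2], [-2, 1, 1, -2]),  # best gamete needs two switches
]
print(per_chromosome_best(chromosomes, per_chrom=1))  # 6
print(whole_genome_best(chromosomes, total=2))        # 10.0
```

With the same total of two recombinations, the shared budget places both on the second chromosome, where they are useful, which mirrors the qualitative advantage reported for whole-genome targeting.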

Prospects for developing allergen‐depleted food crops

Abstract

In addition to the challenge of meeting global demand for food production, there are increasing concerns about food safety and the need to protect consumer health from the negative effects of foodborne allergies. Certain bio-molecules (usually proteins) present in food can act as allergens that trigger unusual immunological reactions, with potentially life-threatening consequences. The relentless working lifestyles of the modern era often incorporate poor eating habits that include readymade prepackaged and processed foods, which contain additives such as peanuts, tree nuts, wheat, and soy-based products, rather than traditional home cooking. Of the predominant allergenic foods (soybean, wheat, fish, peanut, shellfish, tree nuts, eggs, and milk), peanuts (Arachis hypogaea) are the best characterized source of allergens, followed by tree nuts (Juglans regia, Prunus amygdalus, Corylus avellana, Carya illinoinensis, Anacardium occidentale, Pistacia vera, Bertholletia excelsa), wheat (Triticum aestivum), soybeans (Glycine max), and kidney beans (Phaseolus vulgaris). The prevalence of food allergies has risen significantly in recent years, as has the chance of accidental exposure to such foods. In contrast, the standards of detection, diagnosis, and cure have not kept pace and unfortunately are often suboptimal. In this review, we mainly focus on the prevalence of allergies associated with peanut, tree nuts, wheat, soybean, and kidney bean, highlighting their physiological properties and functions as well as considering research directions for tailoring allergen gene expression. In particular, we discuss how recent advances in molecular breeding, genetic engineering, and genome editing can be used to develop low-allergen food crops that protect consumer health.

Genome‐wide association of dry (Tamar) date palm fruit color

Abstract

Date palm (Phoenix dactylifera) fruit (dates) are an economically and culturally significant crop in the Middle East and North Africa. There are hundreds of different commercial cultivars producing dates with distinctive shapes, colors, and sizes. Genetic studies of some date palm traits have been performed, including sex determination, sugar content, and fresh fruit color. In this study, we used genome sequences and image data of 199 dry dates (Tamar) collected from 14 countries to identify genetic loci associated with the color of this fruit stage. Here, we find loci across multiple linkage groups (LG) associated with dry fruit color phenotype. We recover both the previously identified VIRESCENS (VIR) genotype associated with fresh fruit yellow or red color and new associations with the lightness and darkness of dry fruit. This study will add resolution to our understanding of date color phenotype, especially at the most commercially important Tamar stage.

Skim exome capture genotyping in wheat

Abstract

Next-generation sequencing (NGS) technology advancements continue to reduce the cost of high-throughput genome-wide genotyping for breeding and genetics research. Skim sequencing, which surveys the entire genome at low coverage, has become feasible for quantitative trait locus (QTL) mapping and genomic selection in various crops. However, the genome complexity of allopolyploid crops such as wheat (Triticum aestivum L.) still poses a significant challenge for genome-wide genotyping. Targeted sequencing of the protein-coding regions (i.e., exome) reduces sequencing costs compared to whole genome re-sequencing and can be used for marker discovery and genotyping. We developed a method called skim exome capture (SEC) that combines the strengths of these existing technologies and produces targeted genotyping data while decreasing the cost on a per-sample basis compared to traditional exome capture. Specifically, we fragmented genomic DNA using a tagmentation approach, then enriched those fragments for the low-copy genic portion of the genome using commercial wheat exome baits and multiplexed the sequencing at different levels to achieve desired coverage. We demonstrated that for a library of 48 samples, ∼7–8× target coverage was sufficient for high-quality variant detection. For higher multiplexing levels of 528 and 1056 samples per library, we achieved an average coverage of 0.76× and 0.32×, respectively. Combining these lower coverage SEC sequencing data with genotype imputation using a customized wheat practical haplotype graph database that we developed, we identified hundreds of thousands of high-quality genic variants across the genome. The SEC method can be used for high-resolution QTL mapping, genome-wide association studies, genomic selection, and other downstream applications.

Integrating de novo QTL‐seq and linkage mapping to identify quantitative trait loci conditioning physiological resistance and avoidance to white mold disease in dry bean

Abstract

White mold (WM), caused by the ubiquitous fungus Sclerotinia sclerotiorum, is a devastating disease that limits production and quality of dry bean globally. In the present study, classic linkage mapping combined with QTL-seq was employed in two recombinant inbred line (RIL) populations, “Montrose”/I9365-25 (M25) and “Raven”/I9365-31 (R31), with the initial goal of fine-mapping QTL WM5.4 and WM7.5 that condition WM resistance. The RILs were phenotyped for WM reactions under greenhouse (straw test) and field environments. The general regions of WM5.4 and WM7.5 were reconfirmed with both mapping strategies within each population. Combining the results from both mapping strategies, WM5.4 was delimited to a 22.60–36.25 Mb interval in the heterochromatic regions on Pv05, while WM7.5 was narrowed to a 0.83 Mb (3.99–4.82 Mb) region on the Pv07 chromosome. Furthermore, additional QTL WM2.2a (3.81–7.24 Mb), WM2.2b (11.18–17.37 Mb, heterochromatic region), and WM2.2c (23.33–25.94 Mb) were mapped to narrowed genomic intervals on Pv02, and WM4.2 was mapped to a 0.89 Mb physical interval at the distal end of the Pv04 chromosome. Gene models encoding gibberellin 2-oxidase proteins regulating plant architecture are likely candidate genes associated with WM2.2a resistance. Nine gene models encoding a disease resistance protein (quinone reductase family protein and ATWRKY69) found within the WM5.4 QTL interval are putative candidate genes. Clusters of 13 and 5 copies of gene models encoding cysteine-rich receptor-like kinase and receptor-like protein kinase-related family proteins, respectively, are potential candidate genes associated with WM7.5 resistance and most likely trigger physiological resistance to WM. Acquired knowledge of the narrowed major QTL intervals, flanking markers, and candidate genes provides promising opportunities to develop functional molecular markers to implement marker-assisted selection for WM-resistant dry bean cultivars.