The conservation of gene models can support genome annotation

Abstract

Many genome annotations include false-positive gene models, leading to errors in phylogenetic and comparative studies. Here, we propose a method to support gene model prediction based on evolutionary conservation and use it to identify potentially erroneous annotations. Using this method, we developed a set of 15,345 representative gene models from 12 legume assemblies that can be used to support genome annotations for other legumes.
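The abstract does not say how conservation is scored, so the following is only a minimal sketch of one simple possibility: flag gene models whose ortholog group is found in too few assemblies. The function name `flag_poorly_conserved`, the `min_assemblies` threshold, the input layout, and the toy identifiers are all hypothetical and are not taken from the paper.

```python
def flag_poorly_conserved(ortholog_groups, min_assemblies):
    """Return gene models whose ortholog group spans fewer than
    `min_assemblies` distinct assemblies (hypothetical criterion)."""
    flagged = []
    for members in ortholog_groups.values():
        # Count distinct assemblies that contribute a member to this group.
        assemblies = {assembly for assembly, _gene in members}
        if len(assemblies) < min_assemblies:
            flagged.extend(gene for _assembly, gene in members)
    return sorted(flagged)


# Hypothetical toy input: group id -> list of (assembly, gene model) pairs.
toy_groups = {
    "OG1": [("asmA", "geneA1"), ("asmB", "geneB1"), ("asmC", "geneC1")],
    "OG2": [("asmA", "geneA2")],
    "OG3": [("asmB", "geneB3"), ("asmC", "geneC3")],
}

suspects = flag_poorly_conserved(toy_groups, min_assemblies=2)
print(suspects)  # gene models seen in only one assembly
```

A gene model flagged this way is only a candidate for review, since a lineage-specific gene would also appear in a single assembly.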

Temporal gene knockout using a heat shock–inducible genome-editing system in plants

Abstract

Clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated nuclease 9 (Cas9) has emerged as a powerful tool to generate targeted loss-of-function mutations for functional genomic studies. As a next step, tools to generate genome modifications in a spatially and temporally precise manner will enable researchers to further dissect gene function. Here, we present two heat shock–inducible genome-editing (IGE) systems that efficiently edit target genes when the system is induced, thus allowing us to target specific developmental stages. For this conditional editing system, we chose the natural heat-inducible promoter of the heat-shock protein 18.2 (HSP18.2) gene from Arabidopsis thaliana and the synthetic heat-inducible promoter heat shock–response element (HSE)-COR15A to drive the expression of Cas9. We tested these two IGE systems in Arabidopsis using cyclic or continuous heat-shock treatments at the seedling and bolting stages. Real-time quantitative polymerase chain reaction analysis revealed that the HSP18.2 IGE system exhibited higher Cas9 expression levels than the HSE-COR15A IGE system under both cyclic and continuous treatments. By targeting brassinosteroid-insensitive 1 (BRI1) and phytoene desaturase (PDS), we demonstrate that both cyclic and continuous heat inductions successfully activated the HSP18.2 IGE system at the two developmental stages, resulting in highly efficient targeted mutagenesis and clear phenotypic outcomes. By contrast, the HSE-COR15A IGE system was induced only at the seedling stage and yielded lower mutagenesis frequencies than the HSP18.2 IGE system. The presented heat shock–IGE systems can be conditionally induced to efficiently inactivate genes at any developmental stage and are uniquely suited for the dissection and systematic characterization of essential genes.

Skim exome capture genotyping in wheat

Abstract

Advances in next-generation sequencing (NGS) technology continue to reduce the cost of high-throughput genome-wide genotyping for breeding and genetics research. Skim sequencing, which surveys the entire genome at low coverage, has become feasible for quantitative trait locus (QTL) mapping and genomic selection in various crops. However, the genome complexity of allopolyploid crops such as wheat (Triticum aestivum L.) still poses a significant challenge for genome-wide genotyping. Targeted sequencing of the protein-coding regions (i.e., the exome) reduces sequencing costs compared to whole-genome resequencing and can be used for marker discovery and genotyping. We developed a method called skim exome capture (SEC) that combines the strengths of these existing technologies and produces targeted genotyping data at a lower per-sample cost than traditional exome capture. Specifically, we fragmented genomic DNA using a tagmentation approach, then enriched those fragments for the low-copy genic portion of the genome using commercial wheat exome baits and multiplexed the sequencing at different levels to achieve the desired coverage. We demonstrated that for a library of 48 samples, ∼7–8× target coverage was sufficient for high-quality variant detection. For higher multiplexing levels of 528 and 1,056 samples per library, we achieved an average coverage of 0.76× and 0.32×, respectively. By combining these lower-coverage SEC sequencing data with genotype imputation using a customized wheat practical haplotype graph database that we developed, we identified hundreds of thousands of high-quality genic variants across the genome. The SEC method can be used for high-resolution QTL mapping, genome-wide association studies, genomic selection, and other downstream applications.
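The link between multiplexing level and coverage can be illustrated with a rough back-of-the-envelope sketch. It assumes that total sequencing output and on-target rate stay constant across libraries, so per-sample coverage scales inversely with the number of pooled samples. That assumption, the 7.5× midpoint used for the 48-sample library, and the function name `per_sample_coverage` are illustrative and are not stated in the abstract.

```python
def per_sample_coverage(reference_coverage, reference_plex, target_plex):
    """Scale per-sample coverage inversely with the number of pooled samples.

    Assumes the same total sequencing output and on-target rate at both
    multiplexing levels (an idealization, not a claim from the abstract).
    """
    if reference_plex <= 0 or target_plex <= 0:
        raise ValueError("multiplexing levels must be positive")
    return reference_coverage * reference_plex / target_plex


# Assumed midpoint of the ~7-8x reported for the 48-sample library.
REFERENCE_COVERAGE = 7.5
REFERENCE_PLEX = 48

expected = {
    plex: per_sample_coverage(REFERENCE_COVERAGE, REFERENCE_PLEX, plex)
    for plex in (528, 1056)
}

for plex, coverage in expected.items():
    print(f"{plex} samples per library: ~{coverage:.2f}x expected")
```

Under these assumptions the estimates come out at roughly 0.68× and 0.34×, close to the reported 0.76× and 0.32×. The remaining gap would be expected if sequencing output or capture efficiency differed between libraries.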