A chromosome‐level genome of mango exclusively from long‐read sequence data

Abstract

Improvements in long-read sequencing techniques have greatly accelerated plant genome sequencing. Current de novo assemblies are routinely achieved by assembling long-read sequence data into contigs, which are then scaffolded to chromosome level using chromatin conformation capture. We report here a chromosome-level mango genome assembled using only PacBio high-fidelity (HiFi) long reads. HiFi reads at high coverage (204x) resulted in the assembly of 17 chromosomes, each as a single contig with telomeres at both ends. The remaining three chromosomes were each represented by two contigs, with telomeres at one end and ribosomal repeats at the other end. Analysis of the contig ends allowed these contigs to be paired and linked, generating the remaining three complete chromosomes, telomere-to-telomere but with ribosomal repeats of uncertain length. The assembled genome was 365 Mb with 100% completeness as assessed by Benchmarking Universal Single-Copy Orthologs analysis. The assembled haplotypes showed extensive structural differences. This approach using very high genome coverage may be useful for assembling high-quality genomes for many other plants.
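
The contig-end check described above can be illustrated with a minimal sketch. It is not the pipeline used in this work. It assumes the Arabidopsis-type plant telomere repeat (TTTAGGG), and the function names, the 1 kb window and the 50% threshold are hypothetical choices made only for illustration. A contig with telomeric repeats at both ends would be a candidate complete chromosome, whereas a contig with telomeric repeats at only one end would be a candidate for pairing with another contig.

```python
# Assumption: Arabidopsis-type plant telomere repeat; not taken from the paper.
TELOMERE_MOTIF = "TTTAGGG"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def telomere_fraction(window, motif=TELOMERE_MOTIF):
    """Fraction of a window covered by the motif or its reverse complement."""
    if not window:
        return 0.0
    window = window.upper()
    hits = window.count(motif) + window.count(reverse_complement(motif))
    return hits * len(motif) / len(window)


def classify_contig_ends(seq, window=1000, min_fraction=0.5):
    """Return (start_is_telomeric, end_is_telomeric) for one contig.

    window and min_fraction are illustrative thresholds, not published values.
    """
    start = telomere_fraction(seq[:window]) >= min_fraction
    end = telomere_fraction(seq[-window:]) >= min_fraction
    return start, end


# Synthetic example: one contig capped at both ends, one capped at the start only.
body = "ACGTTGCA" * 500
complete = "CCCTAAA" * 200 + body + "TTTAGGG" * 200
partial = "CCCTAAA" * 200 + body
both_ends = classify_contig_ends(complete)
one_end = classify_contig_ends(partial)
```

In this sketch both strands of the motif are counted in each window, so the orientation of the contig does not affect the result.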