Bread wheat (Triticum aestivum) is a major food crop and an important plant system for agricultural genetics research. However, due to the complexity and size of its allohexaploid genome, genomic resources are limited compared to other major crops. The IWGSC recently published a reference genome and associated annotation (IWGSC CS v1.0, Chinese Spring) that has been widely adopted and utilized by the wheat community. Although this reference assembly represents all three wheat subgenomes at chromosome scale, it was derived from short reads, and thus is missing a substantial portion of the expected 16 Gbp of genomic sequence. We earlier published an independent wheat assembly (Triticum_aestivum_3.1, Chinese Spring) that came much closer in length to the expected genome size, although it was only a contig-level assembly lacking gene annotations. Here, we describe a reference-guided effort to scaffold those contigs into chromosome-length pseudomolecules, add in any missing sequence that was unique to the IWGSC CS v1.0 assembly, and annotate the resulting pseudomolecules with genes. Our updated assembly, Triticum_aestivum_4.0, contains 15.07 Gbp of non-gap sequence anchored to chromosomes, which is 1.2 Gbp more than the previous reference assembly. It includes 108,639 genes unambiguously localized to chromosomes, including over 2,000 genes that were previously unplaced. We also discovered more than 5,700 additional gene copies, facilitating the accurate annotation of functional gene duplications including at the Ppd-B1 photoperiod response locus.
This content will become publicly available on March 7, 2026
Aegilops tauschii genome assembly v6.0 with improved sequence contiguity differentiates assembly errors from genuine differences with the D subgenome of Chinese Spring wheat assembly IWGSC RefSeq v2.1
Abstract Aegilops tauschii is the donor of the D subgenome of hexaploid wheat and a valuable genetic resource for wheat improvement. Several reference-quality genome sequences have been reported for A. tauschii accession AL8/78. A new genome sequence assembly (Aet v6.0) built from long Pacific Biosciences HiFi reads and employing an optical genome map constructed with a new technology is reported here for this accession. The N50 contig length of 31.81 Mb greatly exceeded that of the previous AL8/78 genome sequence assembly (Aet v5.0). Of 1,254 super-scaffolds, 92, comprising 98% of the total super-scaffold length, were anchored on a high-resolution genetic map, and pseudomolecules were assembled. The number of gaps in the pseudomolecules was reduced from 52,910 in Aet v5.0 to 351 in Aet v6.0. Gene models were transferred from the Aet v5.0 assembly into the Aet v6.0 assembly. A total of 40,447 putative orthologous gene pairs were identified between the Aet v6.0 and Chinese Spring wheat IWGSC RefSeq v2.1 D-subgenome pseudomolecules. Orthologous gene pairs were used to compare the structure of the A. tauschii and wheat D-subgenome pseudomolecules. A total of 223 structural differences were identified. They included 44 large differences in sequence orientation and 25 differences in sequence location. A technique for discriminating between assembly errors and real structural variation between closely related genomes is suggested.
- Award ID(s):
- 2102953
- PAR ID:
- 10626806
- Editor(s):
- Sachs, M
- Publisher / Repository:
- Oxford Academic
- Date Published:
- Journal Name:
- G3: Genes, Genomes, Genetics
- Volume:
- 15
- Issue:
- 5
- ISSN:
- 2160-1836
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
SUMMARY Aegilops species represent the most important gene pool for breeding bread wheat (Triticum aestivum). Thus, understanding the genome evolution, including chromosomal structural rearrangements and syntenic relationships among Aegilops species or between Aegilops and wheat, is important for both basic genome research and practical breeding applications. In the present study, we attempted to develop subgenome D‐specific fluorescence in situ hybridization (FISH) probes by selecting D‐specific oligonucleotides based on the reference genome of Chinese Spring. The oligo‐based chromosome painting probes consisted of approximately 26 000 oligos per chromosome and their specificity was confirmed in both diploid and polyploid species containing the D subgenome. Two previously reported translocations involving two D chromosomes have been confirmed in wheat varieties and their derived lines. We demonstrate that the oligo painting probes can be used not only to identify the translocations involving D subgenome chromosomes, but also to determine the precise positions of chromosomal breakpoints. Chromosome painting of 56 accessions of Ae. tauschii from different origins led us to identify two novel translocations: a reciprocal 3D‐7D translocation in two accessions and a complex 4D‐5D‐7D translocation in one accession. Painting probes were also used to analyze chromosomes from more diverse Aegilops species. These probes produced FISH signals in four different genomes. Chromosome rearrangements were identified in Aegilops umbellulata, Aegilops markgrafii, and Aegilops uniaristata, thus providing syntenic information that will be valuable for the application of these wild species in wheat breeding.
-
Pyhäjärvi, T (Ed.) Abstract Blackberries (Rubus spp.) are the fourth most economically important berry crop worldwide. Genome assemblies and annotations have been developed for Rubus species in subgenus Idaeobatus, including black raspberry (R. occidentalis), red raspberry (R. idaeus), and R. chingii, but very few genomic resources exist for blackberries and their relatives in subgenus Rubus. Here we present a chromosome-length assembly and annotation of the diploid blackberry germplasm accession “Hillquist” (R. argutus). “Hillquist” is the only known source of primocane-fruiting (annual-fruiting) in tetraploid fresh-market blackberry breeding programs and is represented in the pedigree of many important cultivars worldwide. The “Hillquist” assembly, generated using Pacific Biosciences long reads scaffolded with high-throughput chromosome conformation capture sequencing, consisted of 298 Mb, of which 270 Mb (90%) was placed on 7 chromosome-length scaffolds with an average length of 38.6 Mb. Approximately 52.8% of the genome was composed of repetitive elements. The genome sequence was highly collinear with a novel maternal haplotype-resolved linkage map of the tetraploid blackberry selection A-2551TN and genome assemblies of R. chingii and red raspberry. A total of 38,503 protein-coding genes were predicted, of which 72% were functionally annotated. Eighteen flowering gene homologs within a previously mapped locus aligning to an 11.2 Mb region on chromosome Ra02 were identified as potential candidate genes for primocane-fruiting. The utility of the “Hillquist” genome has been demonstrated here by the development of the first genotyping-by-sequencing-based linkage map of tetraploid blackberry and the identification of possible candidate genes for primocane-fruiting. This chromosome-length assembly will facilitate future studies in Rubus biology, genetics, and genomics and strengthen applied breeding programs.
-
de los Campos, G (Ed.) Abstract De novo genome assembly is essential for genomic research. High-quality genomes assembled into phased pseudomolecules are challenging to produce and often contain assembly errors because of repeats, heterozygosity, or the chosen assembly strategy. Although algorithms that produce partially phased assemblies exist, haploid draft assemblies that may lack biological information remain favored because they are easier to generate and use. We developed HaploSync, a suite of tools that produces fully phased, chromosome-scale diploid genome assemblies, and performs extensive quality control to limit assembly artifacts. HaploSync scaffolds sequences from a draft diploid assembly into phased pseudomolecules guided by a genetic map and/or the genome of a closely related species. HaploSync generates a report that visualizes the relationships between current and legacy sequences, for both haplotypes, and displays their gene and marker content. This quality control helps the user identify misassemblies and guides HaploSync’s correction of scaffolding errors. Finally, HaploSync fills assembly gaps with unplaced sequences and resolves collapsed homozygous regions. In a series of plant, fungal, and animal kingdom case studies, we demonstrate that HaploSync efficiently increases the assembly contiguity of phased chromosomes, improves completeness by filling gaps, corrects scaffolding, and correctly phases highly heterozygous, complex regions.
-
Abstract Background: The release of the first reference genome of walnut (Juglans regia L.) enabled many achievements in the characterization of walnut genetic and functional variation. However, it is highly fragmented, preventing the integration of genetic, transcriptomic, and proteomic information to fully elucidate walnut biological processes. Findings: Here, we report the new chromosome-scale assembly of the walnut reference genome (Chandler v2.0) obtained by combining Oxford Nanopore long-read sequencing with chromosome conformation capture (Hi-C) technology. Relative to the previous reference genome, the new assembly features an 84.4-fold increase in N50 size, with the 16 chromosomal pseudomolecules assembled and representing 95% of its total length. Using full-length transcripts from single-molecule real-time sequencing, we predicted 37,554 gene models, with a mean gene length higher than the previous gene annotations. Most of the new protein-coding genes (90%) present both start and stop codons, which represents a significant improvement compared with Chandler v1.0 (only 48%). We then tested the potential impact of the new chromosome-level genome on different areas of walnut research. By studying the proteome changes occurring during male flower development, we observed that the virtual proteome obtained from Chandler v2.0 presents fewer artifacts than the previous reference genome, enabling the identification of a new potential pollen allergen in walnut. Also, the new chromosome-scale genome facilitates in-depth studies of intraspecies genetic diversity by revealing previously undetected autozygous regions in Chandler, likely resulting from inbreeding, and 195 genomic regions highly differentiated between Western and Eastern walnut cultivars. Conclusion: Overall, Chandler v2.0 will serve as a valuable resource to better understand and explore walnut biology.
