Title: Structure and sequence evolution in the pennycress (Thlaspi arvense) pangenome
Summary Eukaryotic genomes harbor many forms of variation, including nucleotide diversity and structural polymorphisms, which experience natural selection and contribute to genome evolution and biodiversity. However, harnessing this variation for agriculture hinges on our ability to detect, quantify, catalog, and utilize genetic diversity. Here, we explore seven complete genomes of the emerging biofuel crop pennycress (Thlaspi arvense) drawn from across the species' current genetic diversity to catalog variation in genome structure and content. Across this new pangenome resource, we find contrasting evolutionary modes in different genomic regions. Gene-poor, repeat-rich pericentromeric regions experience frequent rearrangements, including repeated centromere repositioning. In contrast, conserved gene-dense chromosome arms maintain large-scale synteny across accessions, even in fast-evolving immune genes where microsynteny breaks down across species but the macrosynteny of gene cluster positioning is maintained. Our findings highlight that multiple elements of the genome experience dynamic evolution that conserves functional content on the chromosome scale but allows rearrangement and presence-absence variation on a local scale. This diversity is invisible to classical reference-based approaches and highlights the strength and utility of pangenomic resources. These results provide a valuable case study of rapid genomic structural evolution within a species and powerful resources for crop development in an emerging biofuel crop.
Award ID(s):
1906486
PAR ID:
10651096
Author(s) / Creator(s):
Publisher / Repository:
bioRxiv
Date Published:
Format(s):
Medium: X
Institution:
University of California, Davis
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary White oak (Quercus alba) is an abundant forest tree species across eastern North America that is ecologically, culturally, and economically important. We report the first haplotype‐resolved chromosome‐scale genome assembly of Q. alba and conduct comparative analyses of genome structure and gene content against other published Fagaceae genomes. We investigate the genetic diversity of this widespread species and the phylogenetic relationships among oaks using whole genome data. Despite strongly conserved chromosome synteny and genome size across Quercus, certain gene families have undergone rapid changes in size, including defense genes. Unbiased annotation of resistance (R) genes across oaks revealed that the overall number of R genes is similar across species – as are the chromosomal locations of R gene clusters – but gene number within clusters is more labile. We found that Q. alba has high genetic diversity, much of which predates its divergence from other oaks and likely impacts divergence time estimations. Our phylogenetic results highlight widespread phylogenetic discordance across the genus. The white oak genome represents a major new resource for studying genome diversity and evolution in Quercus. Additionally, we show that unbiased gene annotation is key to accurately assessing R gene evolution in Quercus.
  2. ABSTRACT Yellow monkeyflowers (Mimulus guttatus complex, Phrymaceae) are a powerful system for studying ecological adaptation, reproductive variation, and genome evolution. To initiate pan‐genomics in this group, we present four chromosome‐scale assemblies and annotations of accessions spanning a broad evolutionary spectrum: two from a single M. guttatus population, one from the closely related selfing species M. nasutus, and one from a more divergent species M. tilingii. All assemblies are highly complete and resolve centromeric and repetitive regions. Comparative analyses reveal such extensive structural variation in repeat‐rich, gene‐poor regions that large portions of the genome are unalignable across accessions. As a result, this Mimulus pan‐genome is primarily informative in genic regions, underscoring limitations of resequencing approaches in such polymorphic taxa. We document gene presence–absence, investigate the recombination landscape using high‐resolution linkage data, and quantify nucleotide diversity. Surprisingly, pairwise differences at fourfold synonymous sites are exceptionally high, even in regions of very low recombination, reaching ~3.2% within a single M. guttatus population, ~7% within the interfertile M. guttatus species complex (approximately equal to SNP divergence between great apes and Old World monkeys), and ~7.4% between that complex and the reproductively isolated M. tilingii. Genome‐wide patterns of nucleotide variation show little evidence of linked selection, and instead suggest that the concentration of genes (and likely selected sites) in high‐recombination regions may buffer diversity loss. These assemblies, annotations, and comparative analyses provide a robust genomic foundation for Mimulus research and offer new insights into the interplay of recombination, structural variation, and molecular evolution in highly diverse plant genomes.
  3. Summary The fungal pathogen Magnaporthe oryzae Triticum pathotype, which causes wheat blast disease, was first identified in South America and recently spread across continents to South Asia and Africa. Here, we studied the genetic relationship among isolates found on the three continents. Magnaporthe oryzae strains closely related to a South American field isolate B71 were found to have caused the wheat blast outbreaks in South Asia and Africa. Genomic variation among isolates from the three continents was examined using an improved B71 reference genome and whole‐genome sequences. We found strong evidence to support that the outbreaks in Bangladesh and Zambia were caused by the introductions of genetically separated isolates, although they were all close to B71 and, therefore, collectively referred to as the B71 branch. In addition, B71 branch strains carried at least one supernumerary mini‐chromosome. Genome assembly of a Zambian strain revealed that its mini‐chromosome was similar to the B71 mini‐chromosome but with a high level of structural variation. Our findings show that while core genomes of the multiple introductions are highly similar, the mini‐chromosomes have undergone marked diversification. The maintenance of the mini‐chromosome and rapid genomic changes suggest the mini‐chromosomes may serve important virulence or niche adaptation roles under diverse environmental conditions.
  4. Summary Water scarcity, resulting from climate change, poses a significant threat to ecosystems. Syntrichia ruralis, a dryland desiccation‐tolerant moss, provides valuable insights into survival of water‐limited conditions. We sequenced the genome of S. ruralis, conducted transcriptomic analyses, and performed comparative genomic and transcriptomic analyses with existing genomes and transcriptomes, including with the close relative S. caninervis. We took a genetic approach to characterize the role of an S. ruralis transcription factor, identified in transcriptomic analyses, in Arabidopsis thaliana. The genome was assembled into 12 chromosomes encompassing 21 169 protein‐coding genes. Comparative analysis revealed copy number and transcript abundance differences in known desiccation‐associated gene families, and highlighted genome‐level variation among species that may reflect adaptation to different habitats. A significant number of abscisic acid (ABA)‐responsive genes were found to be negatively regulated by a MYB transcription factor (MYB55) that was upstream of the S. ruralis ortholog of ABA‐insensitive 3 (ABI3). We determined that this conserved MYB transcription factor, uncharacterized in Arabidopsis, acts as a negative regulator of an ABA‐dependent stress response in Arabidopsis. The new genomic resources from this emerging model moss offer novel insights into how plants regulate their responses to water deprivation.
  5. Birchler, James (Ed.)
    Abstract Ancient whole-genome duplications (WGDs) are believed to facilitate novelty and adaptation by providing the raw fuel for new genes. However, it is unclear how recent WGDs may contribute to evolvability within recent polyploids. Hybridization accompanying some WGDs may combine divergent gene content among diploid species. Some theory and evidence suggest that polyploids have a greater accumulation and tolerance of gene presence-absence and genomic structural variation, but it is unclear to what extent either is true. To test how recent polyploidy may influence pangenomic variation, we sequenced, assembled, and annotated twelve complete, chromosome-scale genomes of Camelina sativa, an allohexaploid biofuel crop with three distinct subgenomes. Using pangenomic comparative analyses, we characterized gene presence-absence and genomic structural variation both within and between the subgenomes. We found over 75% of ortholog gene clusters are core in Camelina sativa and <10% of sequence space was affected by genomic structural rearrangements. In contrast, 19% of gene clusters were unique to one subgenome, and the majority of these were Camelina-specific (no ortholog in Arabidopsis). We identified an inversion that may contribute to vernalization requirements in winter-type Camelina, and an enrichment of Camelina-specific genes with enzymatic processes related to seed oil quality and Camelina’s unique glucosinolate profile. Genes related to these traits exhibited little presence-absence variation. Our results reveal minimal pangenomic variation in this species, and instead show how hybridization accompanied by WGD may benefit polyploids by merging diverged gene content of different species. 