

Search for: All records

Creators/Authors contains: "Schmutz, Jeremy"


  1. Ingvarsson, P (Ed.)
    Abstract Eucalyptus grandis is a hardwood tree used worldwide as pure species or hybrid partner to breed fast-growing plantation forestry crops that serve as feedstocks of timber and lignocellulosic biomass for pulp, paper, biomaterials, and biorefinery products. The current v2.0 genome reference for the species served as the first reference for the genus and has helped drive the development of molecular breeding tools for eucalypts. Using PacBio HiFi long reads and Omni-C proximity ligation sequencing, we produced an improved, haplotype-phased assembly (v4.0) for TAG0014, an early-generation selection of E. grandis. The 2 haplotypes are 571 Mbp (HAP1) and 552 Mbp (HAP2) in size and consist of 37 and 46 contigs scaffolded onto 11 chromosomes (contig N50 of 28.9 and 16.7 Mbp), respectively. These haplotype assemblies are 70–90 Mbp smaller than the diploid v2.0 assembly but capture all except one of the 22 telomeres, suggesting that substantial redundant sequence was included in the previous assembly. A total of 35,929 (HAP1) and 35,583 (HAP2) gene models were annotated, of which 438 and 472 contain long introns (>10 kbp) in gene models previously (v2.0) identified as multiple smaller genes. These and other improvements have increased gene annotation completeness levels from 93.8 to 99.4% in the v4.0 assembly. We found that 6,493 and 6,346 genes are within tandem duplicate arrays (HAP1 and HAP2, respectively, 18.4 and 17.8% of the total) and >43.8% of the haplotype assemblies consists of repeat elements. Analysis of synteny between the haplotypes and the E. grandis v2.0 reference genome revealed extensive regions of collinearity, but also some major rearrangements, and provided a preview of population and pangenome variation in the species. 
    Free, publicly-accessible full text available May 30, 2026
  2. Birchler, James (Ed.)
    Abstract Ancient whole-genome duplications (WGDs) are believed to facilitate novelty and adaptation by providing the raw fuel for new genes. However, it is unclear how recent WGDs may contribute to evolvability within recent polyploids. Hybridization accompanying some WGDs may combine divergent gene content among diploid species. Some theory and evidence suggest that polyploids have a greater accumulation and tolerance of gene presence-absence and genomic structural variation, but it is unclear to what extent either is true. To test how recent polyploidy may influence pangenomic variation, we sequenced, assembled, and annotated twelve complete, chromosome-scale genomes of Camelina sativa, an allohexaploid biofuel crop with three distinct subgenomes. Using pangenomic comparative analyses, we characterized gene presence-absence and genomic structural variation both within and between the subgenomes. We found over 75% of ortholog gene clusters are core in Camelina sativa and <10% of sequence space was affected by genomic structural rearrangements. In contrast, 19% of gene clusters were unique to one subgenome, and the majority of these were Camelina-specific (no ortholog in Arabidopsis). We identified an inversion that may contribute to vernalization requirements in winter-type Camelina, and an enrichment of Camelina-specific genes with enzymatic processes related to seed oil quality and Camelina’s unique glucosinolate profile. Genes related to these traits exhibited little presence-absence variation. Our results reveal minimal pangenomic variation in this species, and instead show how hybridization accompanied by WGD may benefit polyploids by merging diverged gene content of different species. 
    Free, publicly-accessible full text available November 15, 2025
  3. Abstract Cotton (Gossypium hirsutum L.) is the key renewable fibre crop worldwide, yet its yield and fibre quality show high variability due to genotype-specific traits and complex interactions among cultivars, management practices and environmental factors. Modern breeding practices may limit future yield gains due to a narrow founding gene pool. Precision breeding and biotechnological approaches offer potential solutions, contingent on accurate cultivar-specific data. Here we address this need by generating high-quality reference genomes for three modern cotton cultivars (‘UGA230’, ‘UA48’ and ‘CSX8308’) and updating the ‘TM-1’ cotton genetic standard reference. Despite hypothesized genetic uniformity, considerable sequence and structural variation was observed among the four genomes, which overlap with ancient and ongoing genomic introgressions from ‘Pima’ cotton, gene regulatory mechanisms and phenotypic trait divergence. Differentially expressed genes across fibre development correlate with fibre production, potentially contributing to the distinctive fibre quality traits observed in modern cotton cultivars. These genomes and comparative analyses provide a valuable foundation for future genetic endeavours to enhance global cotton yield and sustainability. 
  4. ABSTRACT Yellow monkeyflowers (Mimulus guttatus complex, Phrymaceae) are a powerful system for studying ecological adaptation, reproductive variation, and genome evolution. To initiate pan‐genomics in this group, we present four chromosome‐scale assemblies and annotations of accessions spanning a broad evolutionary spectrum: two from a single M. guttatus population, one from the closely related selfing species M. nasutus, and one from a more divergent species M. tilingii. All assemblies are highly complete and resolve centromeric and repetitive regions. Comparative analyses reveal such extensive structural variation in repeat‐rich, gene‐poor regions that large portions of the genome are unalignable across accessions. As a result, this Mimulus pan‐genome is primarily informative in genic regions, underscoring limitations of resequencing approaches in such polymorphic taxa. We document gene presence–absence, investigate the recombination landscape using high‐resolution linkage data, and quantify nucleotide diversity. Surprisingly, pairwise differences at fourfold synonymous sites are exceptionally high, even in regions of very low recombination, reaching ~3.2% within a single M. guttatus population, ~7% within the interfertile M. guttatus species complex (approximately equal to SNP divergence between great apes and Old World monkeys), and ~7.4% between that complex and the reproductively isolated M. tilingii. Genome‐wide patterns of nucleotide variation show little evidence of linked selection, and instead suggest that the concentration of genes (and likely selected sites) in high‐recombination regions may buffer diversity loss. These assemblies, annotations, and comparative analyses provide a robust genomic foundation for Mimulus research and offer new insights into the interplay of recombination, structural variation, and molecular evolution in highly diverse plant genomes. 
  5. Somatic embryogenesis-mediated plant regeneration is essential for the genetic manipulation of agronomically important traits in upland cotton. Genotype-specific recalcitrance to regeneration is a primary challenge in deploying genome editing and incorporating useful transgenes into elite cotton germplasm. In this study, transcriptomes of a semi-recalcitrant cotton (Gossypium hirsutum L.) genotype ‘Coker312’ were analyzed at two critical stages of somatic embryogenesis that include non-embryogenic callus (NEC) and embryogenic callus (EC) cells, and the results were compared to a non-recalcitrant genotype ‘Jin668’. We discovered 305 differentially expressed genes in Coker312, whereas, in Jin668, about 6-fold more genes (2155) were differentially expressed. A total of 154 differentially expressed genes were common between the two genotypes. Gene enrichment analysis of the upregulated genes identified functional categories, such as lipid transport, embryo development, regulation of transcription, sugar transport, and vitamin biosynthesis, among others. In Coker312 EC cells, five major transcription factors were highly upregulated: LEAFY COTYLEDON 1 (LEC1), WUS-related homeobox 5 (WOX5), ABSCISIC ACID INSENSITIVE3 (ABI3), FUSCA3 (FUS3), and WRKY2. In Jin668, LEC1, BABY BOOM (BBM), FUS3, and AGAMOUS-LIKE15 (AGL15) were highly expressed in EC cells. We also found that gene expression of these embryogenesis genes was typically higher in Jin668 when compared to Coker312. We conclude that significant differences in the expression of the above genes between Coker312 and Jin668 may be a critical factor affecting the regenerative ability of these genotypes. 
  6. Abstract Currents are unique drivers of oceanic phylogeography and thus determine the distribution of marine coastal species, along with past glaciations and sea-level changes. Here we reconstruct the worldwide colonization history of eelgrass (Zostera marina L.), the most widely distributed marine flowering plant or seagrass from its origin in the Northwest Pacific, based on nuclear and chloroplast genomes. We identified two divergent Pacific clades with evidence for admixture along the East Pacific coast. Two west-to-east (trans-Pacific) colonization events support the key role of the North Pacific Current. Time-calibrated nuclear and chloroplast phylogenies yielded concordant estimates of the arrival of Z. marina in the Atlantic through the Canadian Arctic, suggesting that eelgrass-based ecosystems, hotspots of biodiversity and carbon sequestration, have only been present there for ~243 ky (thousand years). Mediterranean populations were founded ~44 kya, while extant distributions along western and eastern Atlantic shores were founded at the end of the Last Glacial Maximum (~19 kya), with at least one major refuge being the North Carolina region. The recent colonization and five- to sevenfold lower genomic diversity of the Atlantic compared to the Pacific populations raises concern and opportunity about how Atlantic eelgrass might respond to rapidly warming coastal oceans. 