This content will become publicly available on February 1, 2026

Title: Comparison of Recombination Rate, Reference Bias, and Unique Pangenomic Haplotypes in Cannabis sativa Using Seven De Novo Genome Assemblies
Genomic characterization of Cannabis sativa has accelerated rapidly in the last decade as sequencing costs have decreased and public and private interest in the species has increased. Here, we present seven new chromosome-level haplotype-phased genomes of C. sativa. All of these genotypes were alive at the time of publication, and several have numerous years of associated phenotype data. We performed a k-mer-based pangenome analysis to contextualize these assemblies within over 200 existing assemblies. This allowed us to identify unique haplotypes and genomic diversity among C. sativa genotypes. We leveraged linkage maps constructed from F2 progeny of two of the assembled genotypes to characterize the recombination rate across the genome, which showed strong periphery-biased recombination. Lastly, we re-aligned a bulk segregant analysis dataset for the major-effect flowering locus Early1 to several of the new assemblies to evaluate the impact of reference bias on the mapping results and to narrow the locus to a smaller region of the chromosome. These new assemblies, combined with the continued propagation of the genotypes, will contribute to the growing body of genomic resources for C. sativa and accelerate future research efforts.
Award ID(s):
2239530
PAR ID:
10569403
Publisher / Repository:
MDPI
Journal Name:
International Journal of Molecular Sciences
Volume:
26
Issue:
3
ISSN:
1422-0067
Page Range / eLocation ID:
1165
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background: African cichlid fishes are well known for their rapid radiations and are a model system for studying evolutionary processes. Here we compare multiple, high-quality, chromosome-scale genome assemblies to elucidate the genetic mechanisms underlying cichlid diversification and study how genome structure evolves in rapidly radiating lineages. Results: We re-anchored our recent assembly of the Nile tilapia (Oreochromis niloticus) genome using a new high-density genetic map. We also developed a new de novo genome assembly of the Lake Malawi cichlid, Metriaclima zebra, using high-coverage Pacific Biosciences sequencing, and anchored contigs to linkage groups (LGs) using 4 different genetic maps. These new anchored assemblies allow the first chromosome-scale comparisons of African cichlid genomes. Large intra-chromosomal structural differences (~2–28 megabase pairs) among species are common, while inter-chromosomal differences are rare (<10 megabase pairs total). Placement of the centromeres within the chromosome-scale assemblies identifies large structural differences that explain many of the karyotype differences among species. Structural differences are also associated with unique patterns of recombination on sex chromosomes. Structural differences on LG9, LG11, and LG20 are associated with reduced recombination, indicative of inversions between the rock- and sand-dwelling clades of Lake Malawi cichlids. M. zebra has a larger number of recent transposable element insertions compared with O. niloticus, suggesting that several transposable element families have a higher rate of insertion in the haplochromine cichlid lineage. Conclusion: This study identifies novel structural variation among East African cichlid genomes and provides a new set of genomic resources to support research on the mechanisms driving cichlid adaptation and speciation.
  2. Abstract Background: High-quality genomic resources facilitate investigations into behavioral ecology, morphological and physiological adaptations, and the evolution of genomic architecture. Lizards in the genus Sceloporus have a long history as important ecological, evolutionary, and physiological models, making them a valuable target for the development of genomic resources. Findings: We present a high-quality chromosome-level reference genome assembly, SceUnd1.0 (using 10X Genomics Chromium, HiC, and Pacific Biosciences data), and tissue/developmental stage transcriptomes for the eastern fence lizard, Sceloporus undulatus. We performed synteny analysis with other snake and lizard assemblies to identify broad patterns of chromosome evolution including the fusion of micro- and macrochromosomes. We also used this new assembly to provide improved reference-based genome assemblies for 34 additional Sceloporus species. Finally, we used RNAseq and whole-genome resequencing data to compare 3 assemblies, each representing an increased level of cost and effort: Supernova Assembly with data from 10X Genomics Chromium, HiRise Assembly that added data from HiC, and PBJelly Assembly that added data from Pacific Biosciences sequencing. We found that the Supernova Assembly contained the full genome and was a suitable reference for RNAseq and single-nucleotide polymorphism calling, but the chromosome-level scaffolds provided by the addition of HiC data allowed synteny and whole-genome association mapping analyses. The subsequent addition of PacBio data doubled the contig N50 but provided negligible gains in scaffold length. Conclusions: These new genomic resources provide valuable tools for advanced molecular analysis of an organism that has become a model in physiology and evolutionary ecology.
  3. Abstract Background: The Aedes aegypti mosquito is a threat to human health across the globe. The A. aegypti genome was recently re-sequenced and re-assembled. Due to a combination of long-read PacBio and Hi-C sequencing, the AaegL5 assembly is chromosome complete and significantly improves the assembly in key areas such as the M/m sex-determining locus. Release of the updated genome assembly has precipitated the need to reprocess historical functional genomic data sets, including cis-regulatory element (CRE) maps that had previously been generated for A. aegypti. Results: We re-processed and re-analyzed the A. aegypti whole embryo FAIRE-seq data to create an updated embryonic CRE map for the AaegL5 genome. We validated that the new CRE map recapitulates key features of the original AaegL3 CRE map. Further, we built on the improved assembly in the M/m locus to analyze overlaps of open chromatin regions with genes. To support the validation, we created a new method (PeakMatcher) for matching peaks from the same experimental data set across genome assemblies. Conclusion: Use of PeakMatcher software, which is available publicly under an open-source license, facilitated the release of an updated and validated CRE map, which is available through the NIH GEO. These findings demonstrate that PeakMatcher software will be a useful resource for validating and transferring previous annotations to updated genome assemblies.
  4. Larracuente, Amanda (Ed.)
    Abstract Chromosome size and morphology vary within and among species, but little is known about the proximate or ultimate causes of these differences. Cichlid fish species in the tribe Oreochromini share an unusual giant chromosome that is ∼3 times longer than the other chromosomes. This giant chromosome functions as a sex chromosome in some of these species. We test two hypotheses of how this giant sex chromosome may have evolved. The first hypothesis proposes that it evolved by accumulating repetitive elements as recombination was reduced around a dominant sex determination locus, as suggested by canonical models of sex chromosome evolution. An alternative hypothesis is that the giant sex chromosome originated via the fusion of an autosome with a highly repetitive B chromosome, one of which carried a sex determination locus. We test these hypotheses using comparative analysis of chromosome-scale cichlid and teleost genomes. We find that the giant sex chromosome consists of three distinct regions based on patterns of recombination, gene and transposable element content, and synteny to the ancestral autosome. The WZ sex determination locus encompasses the last ∼105 Mb of the 134-Mb giant chromosome. The last 47 Mb of the giant chromosome shares no obvious homology to any ancestral chromosome. Comparisons across 69 teleost genomes reveal that the giant sex chromosome contains unparalleled amounts of endogenous retroviral elements, immunoglobulin genes, and long noncoding RNAs. The results favor the B chromosome fusion hypothesis for the origin of the giant chromosome. 
  5. High rates of dispersal can break down coadapted gene complexes. However, concentrated genomic architecture (i.e., genomic islands of divergence) can suppress recombination to allow evolution of local adaptations despite high gene flow. Pacific lamprey (Entosphenus tridentatus) is a highly dispersive anadromous fish. Observed trait diversity and evidence for a genetic basis of traits suggest it may be locally adapted. We addressed whether concentrated genomic architecture could influence local adaptation for Pacific lamprey. Using two new whole genome assemblies and genotypes from 7,716 single nucleotide polymorphism (SNP) loci in 518 individuals from across the species range, we identified four genomic islands of divergence (on chromosomes 01, 02, 04, and 22). We determined robust phenotype-by-genotype relationships by testing multiple traits across geographic sites. These trait associations probably explain genomic divergence across the species’ range. We genotyped a subset of 302 broadly distributed SNPs in 2,145 individuals for association testing for adult body size, sexual maturity, migration distance and timing, adult swimming ability, and larval growth. Body size traits were strongly associated with SNPs on chromosomes 02 and 04. Moderate associations also implicated SNPs on chromosome 01 in variation in female maturity. Finally, we used candidate SNPs to extrapolate a heterogeneous spatiotemporal distribution of these predicted phenotypes based on independent data sets of larval and adult collections. These maturity and body size results guide future elucidation of factors driving regional optimization of these traits for fitness. Pacific lamprey is culturally important and imperiled. This research addresses biological uncertainties that challenge restoration efforts.