
Title: The genome of cowpea (Vigna unguiculata [L.] Walp.)

Cowpea (Vigna unguiculata [L.] Walp.) is a major crop for worldwide food and nutritional security, especially in sub‐Saharan Africa, that is resilient to hot and drought‐prone environments. An assembly of the single‐haplotype inbred genome of cowpea IT97K‐499‐35 was developed by exploiting the synergies between single‐molecule real‐time sequencing, optical and genetic mapping, and an assembly reconciliation algorithm. A total of 519 Mb is included in the assembled sequences. Nearly half of the assembled sequence is composed of repetitive elements, which are enriched within recombination‐poor pericentromeric regions. A comparative analysis of these elements suggests that genome size differences between Vigna species are mainly attributable to changes in the amount of Gypsy retrotransposons. Conversely, genes are more abundant in more distal, high‐recombination regions of the chromosomes; there appears to be more duplication of genes within the NBS‐LRR and the SAUR‐like auxin superfamilies compared with other warm‐season legumes that have been sequenced. A surprising outcome is the identification of an inversion of 4.2 Mb among landraces and cultivars, which includes a gene that has been associated in other plants with interactions with the parasitic weed Striga gesnerioides. The genome sequence facilitated the identification of a putative syntelog for multiple organ gigantism in legumes. A revised numbering system has been adopted for cowpea chromosomes based on synteny with common bean (Phaseolus vulgaris). An estimate of nuclear genome size of 640.6 Mbp based on cytometry is presented.

Award ID(s):
1814359 1543963 1526742
Journal Name:
The Plant Journal
Page Range / eLocation ID:
p. 767-782
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Cowpea, Vigna unguiculata L. Walp., is a diploid warm‐season legume of critical importance as both food and fodder in sub‐Saharan Africa. This species is also grown in Northern Africa, Europe, Latin America, North America, and East to Southeast Asia. To capture the genomic diversity of domesticates of this important legume, de novo genome assemblies were produced for representatives of six subpopulations of cultivated cowpea identified previously from genotyping of several hundred diverse accessions. In the most complete assembly (IT97K‐499‐35), 26,026 core and 4963 noncore genes were identified, with 35,436 pan genes when considering all seven accessions. GO terms associated with response to stress and defense response were highly enriched among the noncore genes, while core genes were enriched in terms related to transcription factor activity, and transport and metabolic processes. Over 5 million single nucleotide polymorphisms (SNPs) relative to each assembly and over 40 structural variants >1 Mb in size were identified by comparing genomes. Vu10 was the chromosome with the highest frequency of SNPs, and Vu04 had the most structural variants. Noncore genes harbor a larger proportion of potentially disruptive variants than core genes, including missense, stop gain, and frameshift mutations; this suggests that noncore genes substantially contribute to diversity within domesticated cowpea.

  2. Abstract

    As a model organism for studies of cell and environmental biology, the free‐living and cosmopolitan ciliate Euplotes vannus shows intriguing features like dual genome architecture (i.e., separate germline and somatic nuclei in each cell/organism), “gene‐sized” chromosomes, stop codon reassignment, programmed ribosomal frameshifting (PRF) and strong resistance to environmental stressors. However, the molecular mechanisms that account for these remarkable traits remain largely unknown. Here we report a combined analysis of de novo assembled high‐quality macronuclear (MAC; i.e., somatic) and partial micronuclear (MIC; i.e., germline) genome sequences for E. vannus, and transcriptome profiling data under varying conditions. The results demonstrate that: (a) the MAC genome contains more than 25,000 complete “gene‐sized” nanochromosomes (~85 Mb haploid genome size) with the N50 ~2.7 kb; (b) although there is a high frequency of frameshifting at stop codons UAA and UAG, we did not observe impaired transcript abundance as a result of PRF in this species as has been reported for other euplotids; (c) the sequence motif 5′‐TA‐3′ is conserved at nearly all internally‐eliminated sequence (IES) boundaries in the MIC genome, and chromosome breakage sites (CBSs) are duplicated and retained in the MAC genome; (d) by profiling the weighted correlation network of genes in the MAC under different environmental stressors, including nutrient scarcity, extreme temperature, salinity and the presence of ammonia, we identified gene clusters that respond to these external physical or chemical stimulations, and (e) we observed a dramatic increase in HSP70 gene transcription under salinity and chemical stresses but surprisingly, not under temperature changes; we link this temperature‐resistance to the evolved loss of temperature stress‐sensitive elements in regulatory regions. Together with the genome resources generated in this study, which are available online at the Euplotes vannus Genome Database, these data provide molecular evidence for understanding the unique biology of highly adaptable microorganisms.

  3. Abstract

    Pallas's cat, or the manul cat (Otocolobus manul), is a small felid native to the grasslands and steppes of central Asia. Population strongholds in Mongolia and China face growing challenges from climate change, habitat fragmentation, poaching, and other sources. These threats, combined with O. manul’s zoo collection popularity and value in evolutionary biology, necessitate improvement of species genomic resources. We used standalone nanopore sequencing to assemble a 2.5 Gb, 61-contig nuclear assembly and 17,097 bp mitogenome for O. manul. The primary nuclear assembly had 56× sequencing coverage, a contig N50 of 118 Mb, and a 94.7% BUSCO completeness score for Carnivora-specific genes. High genome collinearity within Felidae permitted alignment-based scaffolding onto the fishing cat (Prionailurus viverrinus) reference genome. Manul contigs spanned all 19 felid chromosomes with an inferred total gap length of less than 400 kilobases. Modified basecalling and variant phasing produced an alternate pseudohaplotype assembly and allele-specific DNA methylation calls; 61 differentially methylated regions were identified between haplotypes. Nearest features included classical imprinted genes, non-coding RNAs, and putative novel imprinted loci. The assembled mitogenome successfully resolved existing discordance between Felinae nuclear and mtDNA phylogenies. All assembly drafts were generated from 158 Gb of sequence using seven MinION flow cells.

  4. O’Neill, Rachel (Ed.)
    Abstract

    Echinometra is the most widespread genus of sea urchin and has been the focus of a wide range of studies in ecology, speciation, and reproduction. However, available genetic data for this genus are generally limited to a few select loci. Here, we present a chromosome-level genome assembly based on 10x Genomics, PacBio, and Hi-C sequencing for Echinometra sp. EZ from the Persian/Arabian Gulf. The genome is assembled into 210 scaffolds totaling 817.8 Mb with an N50 of 39.5 Mb. From this assembly, we determined that the E. sp. EZ genome consists of 2n = 42 chromosomes. BUSCO analysis showed that 95.3% of BUSCO genes were complete. Ab initio and transcript-informed gene modeling and annotation identified 29,405 genes, including a conserved Hox cluster. E. sp. EZ can be found in high-temperature and high-salinity environments, and we therefore compared E. sp. EZ gene families and transcription factors associated with environmental stress response (“defensome”) with other echinoid species with similar high-quality genomic resources. While the number of defensome genes was broadly similar for all species, we identified strong signatures of positive selection in E. sp. EZ noncoding elements near genes involved in environmental response pathways as well as losses of transcription factors important for environmental response. These data provide key insights into the biology of E. sp. EZ as well as the diversification of Echinometra more widely and will serve as a useful tool for the community to explore questions in this taxonomic group and beyond.
  5. Abstract

    Background: The hard clam Mercenaria mercenaria is a major marine resource along the Atlantic coasts of North America and has been introduced to other continents for resource restoration or aquaculture activities. Significant mortality events have been reported in the species throughout its native range as a result of diseases (microbial infections, leukemia) and acute environmental stress. In this context, the characterization of the hard clam genome can provide highly needed resources to enable basic (e.g., oncogenesis and cancer transmission, adaptation biology) and applied (clam stock enhancement, genomic selection) sciences. Results: Using a combination of long and short-read sequencing technologies, a 1.86 Gb chromosome-level assembly of the clam genome was generated. The assembly was scaffolded into 19 chromosomes, with an N50 of 83 Mb. Genome annotation yielded 34,728 predicted protein-coding genes, markedly more than the few other members of the Venerida sequenced so far, with coding regions representing only 2% of the assembly. Indeed, more than half of the genome is composed of repeated elements, including transposable elements. Major chromosome rearrangements were detected between this assembly and another recent assembly derived from a genetically segregated clam stock. Comparative analysis of the clam genome allowed the identification of a marked diversification in immune-related proteins, particularly extensive tandem duplications and expansions in tumor necrosis factors (TNFs) and C1q domain-containing proteins, some of which were previously shown to play a role in clam interactions with infectious microbes. The study also generated a comparative repertoire highlighting the diversity and, in some instances, the specificity of LTR-retrotransposon elements, particularly Steamer elements in bivalves. Conclusions: The diversity of immune molecules in M. mercenaria may allow this species to cope with varying and complex microbial and environmental landscapes. The repertoire of transposable elements identified in this study, particularly Steamer elements, should be a prime target for the investigation of cancer cell development and transmission among bivalve mollusks.