
Title: Chromosome-level genome assembly, annotation, and phylogenomics of the gooseneck barnacle Pollicipes pollicipes
Abstract Background

The barnacles are a group of >2,000 species that have fascinated biologists, including Darwin, for centuries. Their lifestyles are extremely diverse, from free-swimming larvae to sessile adults, and even root-like endoparasites. Barnacles also cause hundreds of millions of dollars of losses annually due to biofouling. However, genomic resources for crustaceans, and barnacles in particular, are lacking.


Using 62× Pacific Biosciences coverage, 189× Illumina whole-genome sequencing coverage, 203× HiC coverage, and 69× CHi-C coverage, we produced a chromosome-level genome assembly of the gooseneck barnacle Pollicipes pollicipes. The P. pollicipes genome is 770 Mb long and its assembly is one of the most contiguous and complete crustacean genomes available, with a scaffold N50 of 47 Mb and 90.5% of the BUSCO Arthropoda gene set. Using the genome annotation produced here along with transcriptomes of 13 other barnacle species, we completed phylogenomic analyses on a nearly 2 million amino acid alignment. Contrary to previous studies, our phylogenies suggest that the Pollicipedomorpha is monophyletic and sister to the Balanomorpha, which alters our understanding of barnacle larval evolution and suggests homoplasy in a number of naupliar characters. We also compared transcriptomes of P. pollicipes nauplius larvae and adults and found that nearly one-half of the genes in the genome are differentially expressed, highlighting the vastly different transcriptomes of larvae and adult gooseneck barnacles. Annotation of the genes with KEGG and GO terms reveals that these stages exhibit many differences including cuticle binding, chitin binding, microtubule motor activity, and membrane adhesion.


This study provides high-quality genomic resources for a key group of crustaceans. This is especially valuable given the roles P. pollicipes plays in European fisheries, as a sentinel species for coastal ecosystems, and as a model for studying barnacle adhesion as well as its key position in the barnacle tree of life. A combination of genomic, phylogenetic, and transcriptomic analyses here provides valuable insights into the evolution and development of barnacles.

Journal Name:
Oxford University Press
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background

    The Aldabra giant tortoise (Aldabrachelys gigantea) is one of only two giant tortoise species left in the world. The species is endemic to Aldabra Atoll in Seychelles and is listed as Vulnerable on the International Union for Conservation of Nature Red List (v2.3) due to its limited distribution and threats posed by climate change. Genomic resources for A. gigantea are lacking, hampering conservation efforts for both wild and ex situ populations. A high-quality genome would also open avenues to investigate the genetic basis of the species’ exceptionally long life span.


    We produced the first chromosome-level de novo genome assembly of A. gigantea using PacBio High-Fidelity sequencing and high-throughput chromosome conformation capture. We produced a 2.37-Gbp assembly with a scaffold N50 of 148.6 Mbp and a resolution into 26 chromosomes. RNA sequencing–assisted gene model prediction identified 23,953 protein-coding genes and 1.1 Gbp of repetitive sequences. Synteny analyses among turtle genomes revealed high levels of chromosomal collinearity even among distantly related taxa. To assess the utility of the high-quality assembly for species conservation, we performed a low-coverage resequencing of 30 individuals from wild populations and two zoo individuals. Our genome-wide population structure analyses detected genetic population structure in the wild and identified the most likely origin of the zoo-housed individuals. We further identified putatively deleterious mutations to be monitored.


    We establish a high-quality chromosome-level reference genome for A. gigantea and one of the most complete turtle genomes available. We show that low-coverage whole-genome resequencing, for which alignment to the reference genome is a necessity, is a powerful tool to assess the population structure of the wild population and reveal the geographic origins of ex situ individuals relevant for genetic diversity management and rewilding efforts.

  2. Abstract Background The western flower thrips, Frankliniella occidentalis (Pergande), is a globally invasive pest and plant virus vector on a wide array of food, fiber, and ornamental crops. The underlying genetic mechanisms of the processes governing thrips pest and vector biology, feeding behaviors, ecology, and insecticide resistance are largely unknown. To address this gap, we present the F. occidentalis draft genome assembly and official gene set. Results We report on the first genome sequence for any member of the insect order Thysanoptera. Benchmarking Universal Single-Copy Ortholog (BUSCO) assessments of the genome assembly (size = 415.8 Mb, scaffold N50 = 948.9 kb) revealed a relatively complete and well-annotated assembly in comparison to other insect genomes. The genome is unusually GC-rich (50%) compared to other insect genomes to date. The official gene set (OGS v1.0) contains 16,859 genes, of which ~10% were manually verified and corrected by our consortium. We focused on manual annotation, phylogenetic, and expression evidence analyses for gene sets centered on primary themes in the life histories and activities of plant-colonizing insects. Highlights include the following: (1) divergent clades and large expansions in genes associated with environmental sensing (chemosensory receptors) and detoxification (CYP4, CYP6, and CCE enzymes) of substances encountered in agricultural environments; (2) a comprehensive set of salivary gland genes supported by enriched expression; (3) apparent absence of members of the IMD innate immune defense pathway; and (4) developmental- and sex-specific expression analyses of genes associated with progression from larvae to adulthood through neometaboly, a distinct form of maturation differing from either incomplete or complete metamorphosis in the Insecta. Conclusions Analysis of the F. occidentalis genome offers insights into the polyphagous behavior of this insect pest that finds, colonizes, and survives on a widely diverse array of plants. The genomic resources presented here enable a more complete analysis of insect evolution and biology, providing a missing taxon for contemporary insect genomics-based analyses. Our study also offers a genomic benchmark for molecular and evolutionary investigations of other Thysanoptera species.
  3. Abstract

    Sequencing, assembly, and annotation of the 26.5 Gbp hexaploid genome of coast redwood (Sequoia sempervirens) was completed leading toward discovery of genes related to climate adaptation and investigation of the origin of the hexaploid genome. Deep-coverage short-read Illumina sequencing data from haploid tissue from a single seed were combined with long-read Oxford Nanopore Technologies sequencing data from diploid needle tissue to create an initial assembly, which was then scaffolded using proximity ligation data to produce a highly contiguous final assembly, SESE 2.1, with a scaffold N50 size of 44.9 Mbp. The assembly included several scaffolds that span entire chromosome arms, confirmed by the presence of telomere and centromere sequences on the ends of the scaffolds. The structural annotation produced 118,906 genes with 113 containing introns that exceed 500 Kbp in length and one reaching 2 Mb. Nearly 19 Gbp of the genome represented repetitive content with the vast majority characterized as long terminal repeats, with a 2.9:1 ratio of Copia to Gypsy elements that may aid in gene expression control. Comparison of coast redwood to other conifers revealed species-specific expansions for a plethora of abiotic and biotic stress response genes, including those involved in fungal disease resistance, detoxification, and physical injury/structural remodeling and others supporting flavonoid biosynthesis. Analysis of multiple genes that exist in triplicate in coast redwood but only once in its diploid relative, giant sequoia, supports a previous hypothesis that the hexaploidy is the result of autopolyploidy rather than any hybridizations with separate but closely related conifer species.

  4. Lavrov, Dennis (Ed.)
    Abstract The painted lady butterfly, Vanessa cardui, has the longest migration routes, the widest hostplant diversity, and one of the most complex wing patterns of any insect. Due to minimal culturing requirements, easily characterized wing pattern elements, and technical feasibility of CRISPR/Cas9 genome editing, V. cardui is emerging as a functional genomics model for diverse research programs. Here, we report a high-quality, annotated genome assembly of the V. cardui genome, generated using 84× coverage of PacBio long-read data, which we assembled into 205 contigs with a total length of 425.4 Mb (N50 = 10.3 Mb). The genome was very complete (single-copy complete Benchmarking Universal Single-Copy Orthologs [BUSCO] 97%), with contigs assembled into presumptive chromosomes using synteny analyses. Our annotation used embryonic, larval, and pupal transcriptomes, and 20 transcriptomes across five different wing developmental stages. Gene annotations showed a high level of accuracy and completeness, with 14,437 predicted protein-coding genes. This annotated genome assembly constitutes an important resource for diverse functional genomic studies ranging from the developmental genetic basis of butterfly color pattern, to coevolution with diverse hostplants.
  5. Abstract Background

    Neuropsychiatric disorders afflict a large portion of the global population and constitute a significant source of disability worldwide. Although Genome-wide Association Studies (GWAS) have identified many disorder-associated variants, the underlying regulatory mechanisms linking them to disorders remain elusive, especially those involving distant genomic elements. Expression quantitative trait loci (eQTLs) constitute a powerful means of providing this missing link. However, most eQTL studies in human brains have focused exclusively on cis-eQTLs, which link variants to nearby genes (i.e., those within 1 Mb of a variant). A complete understanding of disease etiology requires a clearer understanding of trans-regulatory mechanisms, which, in turn, entails a detailed analysis of the relationships between variants and expression changes in distant genes.


    By leveraging large datasets from the PsychENCODE consortium, we conducted a genome-wide survey of trans-eQTLs in the human dorsolateral prefrontal cortex. We also performed colocalization and mediation analyses to identify mediators in trans-regulation and use trans-eQTLs to link GWAS loci to schizophrenia risk genes.


    We identified ~80,000 candidate trans-eQTLs (at FDR<0.25) that influence the expression of ~10K target genes (i.e., “trans-eGenes”). We found that many variants associated with these candidate trans-eQTLs overlap with known cis-eQTLs. Moreover, for >60% of these variants (by colocalization), the cis-eQTL’s target gene acts as a mediator for the trans-eQTL SNP's effect on the trans-eGene, highlighting examples of cis-mediation as essential for trans-regulation. Furthermore, many of these colocalized variants fall into a discernable pattern wherein cis-eQTL’s target is a transcription factor or RNA-binding protein, which, in turn, targets the gene associated with the candidate trans-eQTL. Finally, we show that trans-regulatory mechanisms provide valuable insights into psychiatric disorders: beyond what had been possible using only cis-eQTLs, we link an additional 23 GWAS loci and 90 risk genes (using colocalization between candidate trans-eQTLs and schizophrenia GWAS loci).


    We demonstrate that the transcriptional architecture of the human brain is orchestrated by both cis- and trans-regulatory variants and found that trans-eQTLs provide insights into brain-disease biology.
