

Title: Chromosome-level genome assembly, annotation, and phylogenomics of the gooseneck barnacle Pollicipes pollicipes
Abstract Background

The barnacles are a group of >2,000 species that have fascinated biologists, including Darwin, for centuries. Their lifestyles are extremely diverse, from free-swimming larvae to sessile adults, and even root-like endoparasites. Barnacles also cause hundreds of millions of dollars of losses annually due to biofouling. However, genomic resources for crustaceans, and barnacles in particular, are lacking.

Results

Using 62× Pacific Biosciences coverage, 189× Illumina whole-genome sequencing coverage, 203× HiC coverage, and 69× CHi-C coverage, we produced a chromosome-level genome assembly of the gooseneck barnacle Pollicipes pollicipes. The P. pollicipes genome is 770 Mb long and its assembly is one of the most contiguous and complete crustacean genomes available, with a scaffold N50 of 47 Mb and 90.5% of the BUSCO Arthropoda gene set. Using the genome annotation produced here along with transcriptomes of 13 other barnacle species, we completed phylogenomic analyses on a nearly 2 million amino acid alignment. Contrary to previous studies, our phylogenies suggest that the Pollicipedomorpha is monophyletic and sister to the Balanomorpha, which alters our understanding of barnacle larval evolution and suggests homoplasy in a number of naupliar characters. We also compared transcriptomes of P. pollicipes nauplius larvae and adults and found that nearly one-half of the genes in the genome are differentially expressed, highlighting the vastly different transcriptomes of larval and adult gooseneck barnacles. Annotation of the genes with KEGG and GO terms reveals that these stages differ in many functions, including cuticle binding, chitin binding, microtubule motor activity, and membrane adhesion.

Conclusion

This study provides high-quality genomic resources for a key group of crustaceans. This is especially valuable given the roles P. pollicipes plays in European fisheries, as a sentinel species for coastal ecosystems, and as a model for studying barnacle adhesion, as well as its key position in the barnacle tree of life. The combination of genomic, phylogenetic, and transcriptomic analyses presented here provides valuable insights into the evolution and development of barnacles.

 
Award ID(s):
2010898
NSF-PAR ID:
10363697
Publisher / Repository:
Oxford University Press
Journal Name:
GigaScience
Volume:
11
ISSN:
2047-217X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background

    The western flower thrips, Frankliniella occidentalis (Pergande), is a globally invasive pest and plant virus vector on a wide array of food, fiber, and ornamental crops. The underlying genetic mechanisms of the processes governing thrips pest and vector biology, feeding behaviors, ecology, and insecticide resistance are largely unknown. To address this gap, we present the F. occidentalis draft genome assembly and official gene set.

    Results

    We report on the first genome sequence for any member of the insect order Thysanoptera. Benchmarking Universal Single-Copy Ortholog (BUSCO) assessments of the genome assembly (size = 415.8 Mb, scaffold N50 = 948.9 kb) revealed a relatively complete and well-annotated assembly in comparison to other insect genomes. The genome is unusually GC-rich (50%) compared to other insect genomes to date. The official gene set (OGS v1.0) contains 16,859 genes, of which ~10% were manually verified and corrected by our consortium. We focused on manual annotation, phylogenetic, and expression evidence analyses for gene sets centered on primary themes in the life histories and activities of plant-colonizing insects. Highlights include the following: (1) divergent clades and large expansions in genes associated with environmental sensing (chemosensory receptors) and detoxification (CYP4, CYP6, and CCE enzymes) of substances encountered in agricultural environments; (2) a comprehensive set of salivary gland genes supported by enriched expression; (3) apparent absence of members of the IMD innate immune defense pathway; and (4) developmental- and sex-specific expression analyses of genes associated with progression from larvae to adulthood through neometaboly, a distinct form of maturation differing from either incomplete or complete metamorphosis in the Insecta.

    Conclusions

    Analysis of the F. occidentalis genome offers insights into the polyphagous behavior of this insect pest that finds, colonizes, and survives on a widely diverse array of plants. The genomic resources presented here enable a more complete analysis of insect evolution and biology, providing a missing taxon for contemporary insect genomics-based analyses. Our study also offers a genomic benchmark for molecular and evolutionary investigations of other Thysanoptera species.
  2. Abstract Background

    The Aldabra giant tortoise (Aldabrachelys gigantea) is one of only two giant tortoise species left in the world. The species is endemic to Aldabra Atoll in Seychelles and is listed as Vulnerable on the International Union for Conservation of Nature Red List (v2.3) due to its limited distribution and threats posed by climate change. Genomic resources for A. gigantea are lacking, hampering conservation efforts for both wild and ex situ populations. A high-quality genome would also open avenues to investigate the genetic basis of the species’ exceptionally long life span.

    Findings

    We produced the first chromosome-level de novo genome assembly of A. gigantea using PacBio High-Fidelity sequencing and high-throughput chromosome conformation capture. We produced a 2.37-Gbp assembly with a scaffold N50 of 148.6 Mbp and a resolution into 26 chromosomes. RNA sequencing–assisted gene model prediction identified 23,953 protein-coding genes and 1.1 Gbp of repetitive sequences. Synteny analyses among turtle genomes revealed high levels of chromosomal collinearity even among distantly related taxa. To assess the utility of the high-quality assembly for species conservation, we performed a low-coverage resequencing of 30 individuals from wild populations and two zoo individuals. Our genome-wide population structure analyses detected genetic population structure in the wild and identified the most likely origin of the zoo-housed individuals. We further identified putatively deleterious mutations to be monitored.

    Conclusions

    We establish a high-quality chromosome-level reference genome for A. gigantea and one of the most complete turtle genomes available. We show that low-coverage whole-genome resequencing, for which alignment to the reference genome is a necessity, is a powerful tool to assess the population structure of the wild population and reveal the geographic origins of ex situ individuals relevant for genetic diversity management and rewilding efforts.

     
  3. Lavrov, Dennis (Ed.)
    Abstract The painted lady butterfly, Vanessa cardui, has the longest migration routes, the widest hostplant diversity, and one of the most complex wing patterns of any insect. Due to minimal culturing requirements, easily characterized wing pattern elements, and technical feasibility of CRISPR/Cas9 genome editing, V. cardui is emerging as a functional genomics model for diverse research programs. Here, we report a high-quality, annotated genome assembly of the V. cardui genome, generated using 84× coverage of PacBio long-read data, which we assembled into 205 contigs with a total length of 425.4 Mb (N50 = 10.3 Mb). The genome was very complete (single-copy complete Benchmarking Universal Single-Copy Orthologs [BUSCO] 97%), with contigs assembled into presumptive chromosomes using synteny analyses. Our annotation used embryonic, larval, and pupal transcriptomes, and 20 transcriptomes across five different wing developmental stages. Gene annotations showed a high level of accuracy and completeness, with 14,437 predicted protein-coding genes. This annotated genome assembly constitutes an important resource for diverse functional genomic studies ranging from the developmental genetic basis of butterfly color pattern, to coevolution with diverse hostplants. 
  4. Abstract

    Despite being quite speciose (~10,000 extant species), birds have a fairly uniform genome size and karyotype (including the common occurrence of microchromosomes) relative to other vertebrate lineages. Storks (Family Ciconiidae) are a charismatic and distinct group of large wading birds with nearly worldwide distribution but few genomic resources. Here we present an annotated chromosome-level reference genome and chromosome orthology analysis for the wood stork (Mycteria americana), a species that has been federally protected under the Endangered Species Act since 1984. The annotated chromosome-level reference assembly was produced using the blood of a wild female wood stork chick, has a length of 1.35 Gb, a contig N50 of 37 Mb, a scaffold N50 of 80 Mb, and a BUSCO score of 98.8%. We identified 31 autosomal pairs and two sex chromosomes in the wood stork genome, but failed to identify four additional autosomal microchromosomes previously found via karyotyping. Orthology analyses confirmed reported synapomorphies unique to storks and identified the chromosomes participating in these fusions. This study highlights the difficulty and potential problems associated with delineating microchromosomes in reference genome assemblies. It also provides a foundation for studying karyotype evolution in the core water bird clade that includes penguins, albatrosses, storks, cormorants, herons, and ibises. Finally, our reference genome will allow for numerous genomic studies, such as genome-wide association studies of local adaptation, that will aid in wood stork conservation.

     
  5. Abstract

    Sequencing, assembly, and annotation of the 26.5 Gbp hexaploid genome of coast redwood (Sequoia sempervirens) were completed, leading toward the discovery of genes related to climate adaptation and investigation of the origin of the hexaploid genome. Deep-coverage short-read Illumina sequencing data from haploid tissue from a single seed were combined with long-read Oxford Nanopore Technologies sequencing data from diploid needle tissue to create an initial assembly, which was then scaffolded using proximity ligation data to produce a highly contiguous final assembly, SESE 2.1, with a scaffold N50 size of 44.9 Mbp. The assembly included several scaffolds that span entire chromosome arms, confirmed by the presence of telomere and centromere sequences on the ends of the scaffolds. The structural annotation produced 118,906 genes, with 113 containing introns that exceed 500 kbp in length and one reaching 2 Mbp. Nearly 19 Gbp of the genome represented repetitive content, with the vast majority characterized as long terminal repeats, with a 2.9:1 ratio of Copia to Gypsy elements that may aid in gene expression control. Comparison of coast redwood to other conifers revealed species-specific expansions for a plethora of abiotic and biotic stress response genes, including those involved in fungal disease resistance, detoxification, and physical injury/structural remodeling, and others supporting flavonoid biosynthesis. Analysis of multiple genes that exist in triplicate in coast redwood but only once in its diploid relative, giant sequoia, supports a previous hypothesis that the hexaploidy is the result of autopolyploidy rather than any hybridizations with separate but closely related conifer species.

     