Title: Sequencing Bait: Nuclear and Mitogenome Assembly of an Abundant Coastal Tropical and Subtropical Fish, Atherinomorus stipes
Abstract Genetic data from nonmodel species can inform ecology and physiology, giving insight into a species’ distribution and abundance as well as its responses to changing environments, all of which are important for species conservation and management. Moreover, reduced sequencing costs and improved long-read sequencing technology allow researchers to readily generate genomic resources for nonmodel species. Here, we apply Oxford Nanopore long-read sequencing and low-coverage (∼1x) whole genome short-read sequencing technology (Illumina) to assemble a genome and examine population genetics of an abundant tropical and subtropical fish, the hardhead silverside (Atherinomorus stipes). These fish are found in shallow coastal waters and are frequently included in ecological models because they serve as abundant prey for commercially and ecologically important species. Despite their importance in subtropical and tropical ecosystems, little is known about their population connectivity and genetic diversity. Our A. stipes genome assembly is about 1.2 Gb with comparable repetitive element content (∼47%), number of protein duplication events, and DNA methylation patterns to other teleost fish species. Among five sampled populations spanning 43 km of South Florida and the Florida Keys, we find little population structure, suggesting high population connectivity.
Award ID(s):
1754437 1556396
PAR ID:
10464967
Editor(s):
Costantini, Maria
Journal Name:
Genome Biology and Evolution
Volume:
14
Issue:
8
ISSN:
1759-6653
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The spiral gingers (Costus L.) are a pantropical genus of herbaceous perennial monocots; the Neotropical clade of Costus radiated rapidly in the past few million years into over 60 species. The Neotropical spiral gingers have a rich history of evolutionary and ecological research that can motivate and inform modern genetic investigations. Here, we present the first 2 chromosome-level genome assemblies in the genus, for C. pulverulentus and C. lasius, and briefly compare their synteny. We assembled the C. pulverulentus genome from a combination of short-read data, Chicago and Dovetail Hi-C chromatin-proximity sequencing, and alignment with a linkage map. We annotated the genome by mapping a C. pulverulentus transcriptome and querying mapped transcripts against a protein database. We assembled the C. lasius genome with Pacific Biosciences HiFi long reads and alignment to the C. pulverulentus genome. These 2 assemblies are the first published genomes for non-cultivated tropical plants. These genomes solidify the spiral gingers as a model system and will facilitate research on the poorly understood genetic basis of tropical plant diversification. 
  2. Alternate isoforms are important contributors to phenotypic diversity across eukaryotes. Although short-read RNA-sequencing has increased our understanding of isoform diversity, it is challenging to accurately detect full-length transcripts, preventing the identification of many alternate isoforms. Long-read sequencing technologies have made it possible to sequence full-length alternative transcripts, accurately characterizing alternative splicing events, alternate transcription start and end sites, and differences in UTRs. Here, we use Pacific Biosciences (PacBio) long-read RNA-sequencing (Iso-Seq) to examine the transcriptomes of five organs in threespine stickleback fish (Gasterosteus aculeatus), a widely used genetic model species. The threespine stickleback fish has a refined genome assembly in which gene annotations are based on short-read RNA sequencing and predictions from coding sequence of other species. This suggests some of the existing annotations may be inaccurate or alternative transcripts may not be fully characterized. Using Iso-Seq we detected thousands of novel isoforms, indicating many isoforms are absent in the current Ensembl gene annotations. In addition, we refined many of the existing annotations within the genome. We noted many improperly positioned transcription start sites that were refined with long-read sequencing. The Iso-Seq-predicted transcription start sites were more accurate and verified through ATAC-seq. We also detected many alternative splicing events between sexes and across organs. We found a substantial number of genes in both somatic and gonadal samples that had sex-specific isoforms. Our study highlights the power of long-read sequencing to study the complexity of transcriptomes, greatly improving genomic resources for the threespine stickleback fish.
  3. Macqueen, D (Ed.)
    Abstract While the cost and time for assembling a genome have drastically decreased, it remains a challenge to assemble a highly contiguous genome. These challenges are rapidly being overcome by the integration of long-read sequencing technologies. Here, we use long-read sequencing to improve the contiguity of the threespine stickleback fish (Gasterosteus aculeatus) genome, a prominent genetic model species. Using Pacific Biosciences sequencing, we assembled a highly contiguous genome of a freshwater fish from Paxton Lake. Using contigs from this genome, we were able to fill over 76.7% of the gaps in the existing reference genome assembly, improving contiguity over fivefold. Our gap-filling approach was highly accurate, validated by 10X Genomics long-distance linked-reads. In addition to closing a majority of gaps, we were able to assemble segments of telomeres and centromeres throughout the genome. This highlights the power of using long sequencing reads to assemble highly repetitive and difficult-to-assemble regions of genomes. This latest genome build has been released through a newly designed community genome browser that aims to consolidate the growing number of genomics datasets available for the threespine stickleback fish.
  4. Abstract Setaria viridis (green foxtail) is an important model system for improving cereal crops due to its diploid genome, ease of cultivation, and use of C4 photosynthesis. The S. viridis accession ME034V is exceptionally transformable, but the lack of a sequenced genome for this accession has limited its utility. We present a 397 Mb highly contiguous de novo assembly of ME034V using ultra-long nanopore sequencing technology (read N50 = 41 kb). We estimate that this genome is largely complete based on our updated k-mer-based genome size estimate of 401 Mb for S. viridis. Genome annotation identified 37,908 protein-coding genes and >300k repetitive elements comprising 46% of the genome. We compared the ME034V assembly with two other previously sequenced Setaria genomes as well as with a diversity panel of 235 S. viridis accessions. We found the genome assemblies to be largely syntenic, but numerous unique polymorphic structural variants were discovered. Several ME034V deletions may be associated with recent retrotransposition of copia and gypsy LTR repeat families, as evidenced by their low genotype frequencies in the sampled population. Lastly, we performed a phylogenomic analysis to identify gene families that have expanded in Setaria, including those involved in specialized metabolism and plant defense response. The high continuity of the ME034V genome assembly validates the utility of ultra-long DNA sequencing to improve genetic resources for emerging model organisms. Structural variation present in Setaria illustrates the importance of obtaining the proper genome reference for genetic experiments. Thus, we anticipate that the ME034V genome will be of significant utility for the Setaria research community.
  5. Abstract Northern sand lance (Ammodytes dubius) are essential forage fish in most offshore, temperate-to-polar waters on the Northwest Atlantic shelf (NWA), but their population structure and genetic separation from the American sand lance (A. americanus) remain unresolved. We assembled a reference genome for A. dubius (first in the Ammodytidae) and then used low-coverage whole genome sequencing on 262 specimens collected across the species distribution (Mid-Atlantic Bight to Greenland) to quantify genetic differentiation between geographic regions based on single nucleotide polymorphisms. We found strong separation between A. dubius from locations north and south of the Scotian Shelf, largely due to massive genetic differentiation spanning most of chromosomes 21 and 24. Genetic distance increased with geographic distance in the smaller southern cluster but not in the larger northern cluster, where genetic homogeneity appeared across large geographic distances (>10³ km). The two genetic clusters coincide with a clear break in winter sea surface temperature, suggesting that differential offspring survival, rather than limited transport, causes a break in realized connectivity. Nuclear and mitochondrial DNA both clearly delineated A. dubius from A. americanus, thereby confirming a species boundary through spatial niche partitioning into inshore (A. americanus) and offshore (A. dubius) sand lance species on the NWA.