Abstract Long‐read sequencing is driving a new reality for genome science in which highly contiguous assemblies can be produced efficiently with modest resources. Genome assemblies from long‐read sequences are particularly exciting for understanding the evolution of complex genomic regions that are often difficult to assemble. In this study, we utilized long‐read sequencing data to generate a high‐quality genome assembly for an Antarctic eelpout, Ophthalmolycus amberensis, the first for the globally distributed family Zoarcidae. We used this assembly to understand how O. amberensis has adapted to the harsh Southern Ocean and compared it to another group of Antarctic fishes: the notothenioids. We showed that selection has largely acted on different targets in eelpouts relative to notothenioids. However, we did find some overlap; in both groups, genes involved in membrane structure, thermal tolerance and vision have evidence of positive selection. We found evidence for historical shifts of transposable element activity in O. amberensis and other polar fishes, perhaps reflecting a response to environmental change. We were specifically interested in the evolution of two complex genomic loci known to underlie key adaptations to polar seas: haemoglobin and antifreeze proteins (AFPs). We observed unique evolution of the haemoglobin MN cluster in eelpouts and related fishes in the suborder Zoarcoidei relative to other Perciformes. For AFPs, we identified the first species in the suborder with no evidence of afpIII sequences (Cebidichthys violaceus) in the genomic region where they are found in all other Zoarcoidei, potentially reflecting a lineage‐specific loss of this cluster. Beyond polar fishes, our results highlight the power of long‐read sequencing to understand genome evolution.
                    
                            
                            Genomics of cold adaptations in the Antarctic notothenioid fish radiation
                        
                    
    
            Abstract Numerous novel adaptations characterise the radiation of notothenioids, the dominant fish group in the freezing seas of the Southern Ocean. To improve understanding of the evolution of this iconic fish group, here we generate and analyse new genome assemblies for 24 species covering all major subgroups of the radiation, including five long-read assemblies. We present a new estimate for the onset of the radiation at 10.7 million years ago, based on a time-calibrated phylogeny derived from genome-wide sequence data. We identify a two-fold variation in genome size, driven by expansion of multiple transposable element families, and use the long-read data to reconstruct two evolutionarily important, highly repetitive gene family loci. First, we present the most complete reconstruction to date of the antifreeze glycoprotein gene family, whose emergence enabled survival in sub-zero temperatures, showing the expansion of the antifreeze gene locus from the ancestral to the derived state. Second, we trace the loss of haemoglobin genes in icefishes, the only vertebrates lacking functional haemoglobins, through complete reconstruction of the two haemoglobin gene clusters across notothenioid families. Both the haemoglobin and antifreeze genomic loci are characterised by multiple transposon expansions that may have driven the evolutionary history of these genes. 
        
    
    
- PAR ID: 10435748
- Journal Name: Nature Communications
- Volume: 14
- Issue: 1
- ISSN: 2041-1723
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            Gossmann, Toni (Ed.) Abstract Spiders (Araneae) have a diverse spectrum of morphologies, behaviors, and physiologies. Attempts to understand the genomic basis of this diversity are often hindered by their large, heterozygous, and AT-rich genomes with high repeat content, which result in highly fragmented, poor-quality assemblies. As a result, the key attributes of spider genomes, including gene family evolution, repeat content, and gene function, remain poorly understood. Here, we used Illumina and Dovetail Chicago technologies to sequence the genome of the long-jawed spider Tetragnatha kauaiensis, producing an assembly distributed along 3,925 scaffolds with an N50 of ∼2 Mb. Using comparative genomics tools, we explore genome evolution across available spider assemblies. Our findings suggest that the previously reported and vast genome size variation in spiders is linked to the different representation and number of transposable elements. Using statistical tools to uncover gene-family level evolution, we find expansions associated with the sensory perception of taste, immunity, and metabolism. In addition, we report strikingly different histories of chemosensory, venom, and silk gene families, with the first two evolving much earlier, affected by the ancestral whole genome duplication in Arachnopulmonata (∼450 Ma) and exhibiting higher numbers. Together, our findings reveal that spider genomes are highly variable and that genomic novelty may have been driven by the burst of an ancient whole genome duplication, followed by gene family and transposable element expansion.
- 
            Abstract Setaria viridis (green foxtail) is an important model system for improving cereal crops due to its diploid genome, ease of cultivation, and use of C4 photosynthesis. The S. viridis accession ME034V is exceptionally transformable, but the lack of a sequenced genome for this accession has limited its utility. We present a 397 Mb highly contiguous de novo assembly of ME034V using ultra-long nanopore sequencing technology (read N50 = 41 kb). We estimate that this genome is largely complete based on our updated k-mer based genome size estimate of 401 Mb for S. viridis. Genome annotation identified 37,908 protein-coding genes and >300k repetitive elements comprising 46% of the genome. We compared the ME034V assembly with two other previously sequenced Setaria genomes as well as to a diversity panel of 235 S. viridis accessions. We found the genome assemblies to be largely syntenic, but numerous unique polymorphic structural variants were discovered. Several ME034V deletions may be associated with recent retrotransposition of copia and gypsy LTR repeat families, as evidenced by their low genotype frequencies in the sampled population. Lastly, we performed a phylogenomic analysis to identify gene families that have expanded in Setaria, including those involved in specialized metabolism and plant defense response. The high continuity of the ME034V genome assembly validates the utility of ultra-long DNA sequencing to improve genetic resources for emerging model organisms. Structural variation present in Setaria illustrates the importance of obtaining the proper genome reference for genetic experiments. Thus, we anticipate that the ME034V genome will be of significant utility for the Setaria research community.
- 
            Lytechinus variegatus is a camarodont sea urchin found widely throughout the western Atlantic Ocean in a variety of shallow-water marine habitats. Its distribution, abundance, and amenability to developmental perturbation make it a popular model for ecologists and developmental biologists. Here, we present a chromosomal-level genome assembly of L. variegatus generated from a combination of PacBio long reads, 10× Genomics sequencing, and HiC chromatin interaction sequencing. We show L. variegatus has 19 chromosomes with an assembly size of 870.4 Mb. The contiguity and completeness of this assembly are reflected by a scaffold length N50 of 45.5 Mb and BUSCO completeness score of 95.5%. Ab initio and transcript-informed gene modeling and annotation identified 27,232 genes with an average gene length of 12.6 kb, comprising an estimated 39.5% of the genome. Repetitive regions, on the other hand, make up 45.4% of the genome. Physical mapping of well-studied developmental genes onto each chromosome reveals nonrandom spatial distribution of distinct genes and gene families, which provides insight into how certain gene families may have evolved and are transcriptionally regulated in this species. Lastly, aligning RNA-seq and ATAC-seq data onto this assembly demonstrates the value of highly contiguous, complete genome assemblies for functional genomics analyses that is unattainable with fragmented, incomplete assemblies. This genome will be of great value to the scientific community as a resource for genome evolution, developmental, and ecological studies of this species and the Echinodermata.
- 
            Within the arachnids, chromosome-level genome assemblies have greatly accelerated the understanding of gene family evolution and developmental genomics in key groups, such as spiders (Araneae), mites and ticks (Acariformes and Parasitiformes). Among the poorly studied arachnid groups that lack genome assemblies altogether is the clade Pedipalpi, which comprises three orders that together form the sister group of spiders, having diverged from them over 400 Mya. We close this gap by generating the first chromosome-level assembly from a single specimen of the vinegaroon Mastigoproctus giganteus (Uropygi). We show that this highly complete genome retains plesiomorphic conditions for many gene families that have undergone lineage-specific derivations within the more diverse spiders. Consistent with the phylogenetic position of Uropygi, macrosynteny in the M. giganteus genome substantiates the signature of an ancient whole genome duplication.
 An official website of the United States government