Abstract Advances in genome sequencing and annotation have eased the difficulty of identifying new gene sequences. Predicting the functions of these newly identified genes remains challenging. Genes descended from a common ancestral sequence are likely to have common functions. As a result, homology is widely used for gene function prediction. This means functional annotation errors also propagate from one species to another. Several approaches based on machine learning classification algorithms were evaluated for their ability to accurately predict gene function from non‐homology gene features. Among the eight supervised classification algorithms evaluated, random‐forest‐based prediction consistently provided the most accurate gene function prediction. Non‐homology‐based functional annotation provides complementary strengths to homology‐based annotation, with higher average performance in Biological Process GO terms, the domain where homology‐based functional annotation performs the worst, and weaker performance in Molecular Function GO terms, the domain where the accuracy of homology‐based functional annotation is highest. GO prediction models trained with homology‐based annotations were able to successfully predict annotations from a manually curated “gold standard” GO annotation set. Non‐homology‐based functional annotation based on machine learning may ultimately prove useful both as a method to assign predicted functions to orphan genes which lack functionally characterized homologs, and to identify and correct functional annotation errors which were propagated through homology‐based functional annotations.
Origins of lineage‐specific elements via gene duplication, relocation, and regional rearrangement in Neurospora crassa
Abstract The origin of new genes has long been a central interest of evolutionary biologists. However, their novelty means that they evade reconstruction by the classical tools of evolutionary modelling. This evasion of deep ancestral investigation necessitates intensive study of model species within well‐sampled, recently diversified, clades. One such clade is the model genus Neurospora, members of which lack recent gene duplications. Several Neurospora species are comprehensively characterized organisms apt for studying the evolution of lineage‐specific genes (LSGs). Using gene synteny, we documented that 78% of Neurospora LSG clusters are located adjacent to the telomeres featuring extensive tracts of non‐coding DNA and duplicated genes. Here, we report several instances of LSGs that are likely from regional rearrangements and potentially from gene rebirth. To broadly investigate the functions of LSGs, we assembled transcriptomics data from 68 experimental data points and identified co‐regulatory modules using Weighted Gene Correlation Network Analysis, revealing that LSGs are widely but peripherally involved in known regulatory machinery for diverse functions. The ancestral status of the LSG mas‐1, a gene with roles in cell‐wall integrity and cellular sensitivity to antifungal toxins, was investigated in detail alongside its genomic neighbours, indicating that it arose from an ancient lysophospholipase precursor that is ubiquitous in lineages of the Sordariomycetes. Our discoveries illuminate a “rummage region” in the N. crassa genome that enables the formation of new genes and functions to arise via gene duplication and relocation, followed by fast mutation and recombination facilitated by sequence repeats and unconstrained non‐coding sequences.
- PAR ID: 10559425
- Publisher / Repository: Wiley-Blackwell
- Date Published:
- Journal Name: Molecular Ecology
- Volume: 33
- Issue: 24
- ISSN: 0962-1083
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract The evolutionary direction of gonochorism and hermaphroditism is an intriguing mystery to be solved. The special transient hermaphroditic stage makes the little yellow croaker (Larimichthys polyactis) an appealing model for studying hermaphrodite formation. However, the origin and evolutionary relationship between L. polyactis and Larimichthys crocea, the most famous commercial fish species in East Asia, remain unclear. Here, we report the sequence of the L. polyactis genome, which we found is ~706 Mb long (contig N50 = 1.21 Mb and scaffold N50 = 4.52 Mb) and contains 25,233 protein‐coding genes. Phylogenomic analysis suggested that L. polyactis diverged from the common ancestor, L. crocea, approximately 25.4 million years ago. Our high‐quality genome assembly enabled comparative genomic analysis, which revealed several within‐chromosome rearrangements and translocations, without major chromosome fission or fusion events between the two species. The dmrt1 gene was identified as the male‐specific gene in L. polyactis. Transcriptome analysis showed that the expression of dmrt1 and its upstream regulatory gene (rnf183) were both sexually dimorphic. Rnf183, unlike its two paralogues rnf223 and rnf225, is only present in Larimichthys and Lates but not in other teleost species, suggesting that it originated from lineage‐specific duplication or was lost in other teleosts. Phylogenetic analysis shows that the hermaphrodite stage in male L. polyactis may be explained by the sequence evolution of dmrt1. Decoding the L. polyactis genome not only provides insight into the genetic underpinnings of hermaphrodite evolution, but also provides valuable information for enhancing fish aquaculture.
-
Previous evolutionary models of duplicate gene evolution have overlooked the pivotal role of genome architecture. Here, we show that proximity-based regulatory recruitment by distally duplicated genes is an efficient mechanism for modulating tissue-specific production of preexisting proteins. By leveraging genomic asymmetries, we performed a coexpression analysis on Drosophila melanogaster tissue data to show the generality of enhancer capture-divergence (ECD) as a significant evolutionary driver of asymmetric, distally duplicated genes. We use the recently evolved gene HP6/Umbrea as an example of the ECD process. By assaying genome-wide chromosomal conformations in multiple Drosophila species, we show that HP6/Umbrea was inserted near a preexisting, long-distance three-dimensional genomic interaction. We then use this data to identify a newly found enhancer (FLEE1), buried within the coding region of the highly conserved, essential gene MFS18, that likely neofunctionalized HP6/Umbrea. Last, we demonstrate ancestral transcriptional coregulation of HP6/Umbrea’s future insertion site, illustrating how enhancer capture provides a highly evolvable, one-step solution to Ohno’s dilemma.
-
Abstract The phyla Nitrospirota and Nitrospinota have received significant research attention due to their unique nitrogen metabolisms important to biogeochemical and industrial processes. These phyla are common inhabitants of marine and terrestrial subsurface environments and contain members capable of diverse physiologies in addition to nitrite oxidation and complete ammonia oxidation. Here, we use phylogenomics and gene-based analysis with ancestral state reconstruction and gene-tree–species-tree reconciliation methods to investigate the life histories of these two phyla. We find that basal clades of both phyla primarily inhabit marine and terrestrial subsurface environments. The genomes of basal clades in both phyla appear smaller and more densely coded than the later-branching clades. The extant basal clades of both phyla share many traits inferred to be present in their respective common ancestors, including hydrogen, one-carbon, and sulfur-based metabolisms. Later-branching groups, namely the more frequently studied classes Nitrospiria and Nitrospinia, are both characterized by genome expansions driven by either de novo origination or laterally transferred genes that encode functions expanding their metabolic repertoire. These expansions include gene clusters that perform the unique nitrogen metabolisms that both phyla are most well known for. Our analyses support replicated evolutionary histories of these two bacterial phyla, with modern subsurface environments representing a genomic repository for the coding potential of ancestral metabolic traits.
-
Abstract Saline migrants into freshwater habitats constitute among the most destructive invaders in aquatic ecosystems throughout the globe. However, the evolutionary and physiological mechanisms underlying such habitat transitions remain poorly understood. To explore the mechanisms of freshwater adaptation and distinguish between adaptive (evolutionary) and acclimatory (plastic) responses to salinity change, we examined genome‐wide patterns of gene expression between ancestral saline and derived freshwater populations of the Eurytemora affinis species complex, reared under two different common‐garden conditions (0 versus 15 PSU). We found that evolutionary shifts in gene expression (between saline and freshwater inbred lines) showed far greater changes and were more widespread than acclimatory responses to salinity (0 versus 15 PSU). Most notably, 30–40 genes showing evolutionary shifts in gene expression across the salinity boundary were associated with ion transport function, with inorganic cation transmembrane transport forming the largest Gene Ontology category. Of particular interest was the sodium transporter, the Na+/H+ antiporter (NHA) gene family, which was discovered in animals relatively recently. Thirty key ion regulatory genes, such as NHA paralogue #7, demonstrated concordant evolutionary and plastic shifts in gene expression, suggesting the evolution of ion transporter function and plasticity during rapid invasions into novel salinities. Moreover, freshwater invasions were associated with the evolution of reduced plasticity in the freshwater population, again for the same key ion transporters, consistent with the predicted evolution of canalization following adaptation to stressful conditions. Our results have important implications for understanding evolutionary and physiological mechanisms of range expansions by some of the most widespread invaders in aquatic habitats.
