Title: The Lost and Found: Unraveling the Functions of Orphan Genes
Orphan Genes (OGs) are a mysterious class of genes that have recently gained significant attention. Despite lacking a clear evolutionary history, they are found in nearly all living organisms, from bacteria to humans, and they play important roles in diverse biological processes. OGs were first discovered through comparative genomics, followed by the identification of unique genes across different species. OGs tend to be more prevalent in species with larger genomes, such as plants and animals; their evolutionary origins remain unclear, but they potentially arise from gene duplication, horizontal gene transfer (HGT), or de novo origination. Although their precise function is not well understood, OGs have been implicated in crucial biological processes such as development, metabolism, and stress responses. To better understand their significance, researchers are using a variety of approaches, including transcriptomics, functional genomics, and molecular biology. This review offers a comprehensive overview of the current knowledge of OGs in all domains of life, highlighting the possible role of dark transcriptomics in their evolution. More research is needed to fully comprehend the role of OGs in biology and their impact on various biological processes.
Award ID(s):
2038872
PAR ID:
10425008
Journal Name:
Journal of Developmental Biology
Volume:
11
Issue:
2
ISSN:
2221-3759
Page Range / eLocation ID:
27
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Hejnol, Andreas (Ed.)
    Molecular evolution studies, such as phylogenomic studies and genome-wide surveys of selection, often rely on gene families of single-copy orthologs (SC-OGs). Large gene families with multiple homologs in 1 or more species—a phenomenon observed among several important families of genes such as transporters and transcription factors—are often ignored because identifying and retrieving SC-OGs nested within them is challenging. To address this issue and increase the number of markers used in molecular evolution studies, we developed OrthoSNAP, a software that uses a phylogenetic framework to simultaneously split gene families into SC-OGs and prune species-specific inparalogs. We term SC-OGs identified by OrthoSNAP as SNAP-OGs because they are identified using a splitting and pruning procedure analogous to snapping branches on a tree. From 415,129 orthologous groups of genes inferred across 7 eukaryotic phylogenomic datasets, we identified 9,821 SC-OGs; using OrthoSNAP on the remaining 405,308 orthologous groups of genes, we identified an additional 10,704 SNAP-OGs. Comparison of SNAP-OGs and SC-OGs revealed that their phylogenetic information content was similar, even in complex datasets that contain a whole-genome duplication, complex patterns of duplication and loss, transcriptome data where each gene typically has multiple transcripts, and contentious branches in the tree of life. OrthoSNAP is useful for increasing the number of markers used in molecular evolution data matrices, a critical step for robustly inferring and exploring the tree of life.
  2. Advances in genomics and transcriptomics accompanying the rapid accumulation of omics data have provided new tools that have transformed and expanded the traditional concepts of model fungi. Evolutionary genomics and transcriptomics have flourished with the use of classical and newer fungal models that facilitate the study of diverse topics encompassing fungal biology and development. Technological advances have also created the opportunity to obtain and mine large datasets. One such continuously growing dataset is that of the Sordariomycetes, which exhibit a richness of species, ecological diversity, economic importance, and a profound research history on amenable models. Currently, 3,574 species of this class have been sequenced, comprising nearly one-third of the available ascomycete genomes. Among these genomes, multiple representatives of the model genera Fusarium, Neurospora, and Trichoderma are present. In this review, we examine recently published studies and data on the Sordariomycetes that have contributed novel insights to the field of fungal evolution via integrative analyses of the genetic, pathogenic, and other biological characteristics of the fungi. Some of these studies applied ancestral state analysis of gene expression among divergent lineages to infer regulatory network models, identify key genetic elements in fungal sexual development, and investigate the regulation of conidial germination and secondary metabolism. Such multispecies investigations address challenges in the study of fungal evolutionary genomics derived from studies that are often based on limited model genomes and that primarily focus on the aspects of biology driven by knowledge drawn from a few model species.
Rapidly accumulating information and expanding capabilities for systems biological analysis of Big Data are setting the stage for the expansion of the concept of model systems from unitary taxonomic species/genera to inclusive clusters of well-studied models that can facilitate both the in-depth study of specific lineages and also investigation of trait diversity across lineages. The Sordariomycetes class, in particular, offers abundant omics data and a large and active global research community. As such, the Sordariomycetes can form a core omics clade, providing a blueprint for the expansion of our knowledge of evolution at the genomic scale in the exciting era of Big Data and artificial intelligence, and serving as a reference for the future analysis of different taxonomic levels within the fungal kingdom. 
  3. Drought stress is a key limitation for plant growth and colonization of arid habitats. We study the evolution of gene expression response to drought stress in a wild tomato, Solanum chilense, naturally occurring in dry habitats in South America. We conduct a transcriptome analysis under standard and drought experimental conditions to identify drought‐responsive gene networks and estimate the age of the involved genes. We identify two main regulatory networks corresponding to two typical drought‐responsive strategies: cell cycle and fundamental metabolic processes. The metabolic network exhibits a more recent evolutionary origin and a more variable transcriptome response than the cell cycle network (with ancestral origin and higher conservation of the transcriptional response). We also integrate population genomics analyses to reveal positive selection signals acting at the genes of both networks, revealing that genes exhibiting selective sweeps of older age also exhibit greater connectivity in the networks. These findings suggest that adaptive changes first occur at core genes of drought response networks, driving significant network re‐wiring, which likely underpins species divergence and further spread into drier habitats. Combining transcriptomics and population genomics approaches, we decipher the timing of gene network evolution for drought stress response in arid habitats.
  4. Background: The western flower thrips, Frankliniella occidentalis (Pergande), is a globally invasive pest and plant virus vector on a wide array of food, fiber, and ornamental crops. The underlying genetic mechanisms of the processes governing thrips pest and vector biology, feeding behaviors, ecology, and insecticide resistance are largely unknown. To address this gap, we present the F. occidentalis draft genome assembly and official gene set. Results: We report on the first genome sequence for any member of the insect order Thysanoptera. Benchmarking Universal Single-Copy Ortholog (BUSCO) assessments of the genome assembly (size = 415.8 Mb, scaffold N50 = 948.9 kb) revealed a relatively complete and well-annotated assembly in comparison to other insect genomes. The genome is unusually GC-rich (50%) compared to other insect genomes to date. The official gene set (OGS v1.0) contains 16,859 genes, of which ~10% were manually verified and corrected by our consortium. We focused on manual annotation, phylogenetic, and expression evidence analyses for gene sets centered on primary themes in the life histories and activities of plant-colonizing insects. Highlights include the following: (1) divergent clades and large expansions in genes associated with environmental sensing (chemosensory receptors) and detoxification (CYP4, CYP6, and CCE enzymes) of substances encountered in agricultural environments; (2) a comprehensive set of salivary gland genes supported by enriched expression; (3) apparent absence of members of the IMD innate immune defense pathway; and (4) developmental- and sex-specific expression analyses of genes associated with progression from larvae to adulthood through neometaboly, a distinct form of maturation differing from either incomplete or complete metamorphosis in the Insecta. Conclusions: Analysis of the F. occidentalis genome offers insights into the polyphagous behavior of this insect pest that finds, colonizes, and survives on a widely diverse array of plants. The genomic resources presented here enable a more complete analysis of insect evolution and biology, providing a missing taxon for contemporary insect genomics-based analyses. Our study also offers a genomic benchmark for molecular and evolutionary investigations of other Thysanoptera species.
  5. Zinc finger (Zf)-BED proteins are a novel superfamily of transcription factors that controls numerous activities in plants including growth, development, and cellular responses to biotic and abiotic stresses. Despite their important roles in gene regulation, little is known about the specific functions of Zf-BEDs in land plants. The current study identified a total of 750 Zf-BED-encoding genes in 35 land plant species including mosses, bryophytes, lycophytes, gymnosperms, and angiosperms. The gene family size was somewhat proportional to genome size. All identified genes were categorized into 22 classes based on their specific domain architectures. Of these, class I (Zf-BED_DUF-domain_Dimer_Tnp_hAT) was the most common in the majority of the land plants. However, some classes were family-specific, while the others were species-specific, demonstrating diversity at different classification levels. In addition, several novel functional domains were also predicted, including WRKY and nucleotide-binding site (NBS). Comparative genomics, transcriptomics, and proteomics provided insights into the evolutionary history, duplication, divergence, gene gain and loss, species relationship, expression profiling, and structural diversity of Zf-BEDs in land plants. The comprehensive study of Zf-BEDs in Gossypium sp. (cotton) also demonstrated a clear footprint of polyploidization. Overall, this comprehensive evolutionary study of Zf-BEDs in land plants highlighted significant diversity among plant species.