Title: Promoter recruitment drives the emergence of proto-genes in a long-term evolution experiment with Escherichia coli
The phenomenon of de novo gene birth, the emergence of genes from non-genic sequences, has received considerable attention due to the widespread occurrence of genes that are unique to particular species or genomes. Most instances of de novo gene birth have been recognized through comparative analyses of genome sequences in eukaryotes, despite the abundance of novel, lineage-specific genes in bacteria and the relative ease with which bacteria can be studied in an experimental context. Here, we explore the genetic record of the Escherichia coli long-term evolution experiment (LTEE) for changes indicative of “proto-genic” phases of new gene birth in which non-genic sequences evolve stable transcription and/or translation. Over the time span of the LTEE, non-genic regions are frequently transcribed, translated and differentially expressed, with levels of transcription across low-expressed regions increasing in later generations of the experiment. Proto-genes formed downstream of new mutations result either from insertion element activity or chromosomal translocations that fused preexisting regulatory sequences to regions that were not expressed in the LTEE ancestor. Additionally, we identified instances of proto-gene emergence in which a previously unexpressed sequence was transcribed after formation of an upstream promoter, although such cases were rare compared to those caused by recruitment of preexisting promoters. Tracing the origin of the causative mutations, we discovered that most occurred early in the history of the LTEE, often within the first 20,000 generations, and became fixed soon after emergence. Our findings show that proto-genes emerge frequently within evolving populations, can persist stably, and can serve as potential substrates for new gene formation.
Award ID(s):
1951307
PAR ID:
10532238
Editor(s):
McLysaght, Aoife
Publisher / Repository:
Public Library of Science
Date Published:
Journal Name:
PLOS Biology
Volume:
22
Issue:
5
ISSN:
1545-7885
Page Range / eLocation ID:
e3002418
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The rate and spectrum of somatic mutations can diverge from that of germline mutations. This is because somatic tissues experience different mutagenic processes than germline tissues. Here, we use nanorate sequencing (NanoSeq) to identify somatic mutations in Arabidopsis shoots with high sensitivity. We report a somatic mutation rate of 3.6x10^-8 mutations/bp, ~4-5x the germline mutation rate. Somatic mutations displayed elevated signatures consistent with oxidative damage, UV damage, and transcription-coupled nucleotide excision repair. Both somatic and germline mutations were enriched in transposable elements and depleted in genes, but this depletion was greater in germline mutations. Somatic mutation rate correlated with proximity to the centromere, DNA methylation, chromatin accessibility, and gene/TE content, properties which were also largely true of germline mutations. We note DNA methylation and chromatin accessibility have different predicted effects on mutation rate for genic and non-genic regions; DNA methylation associates with a greater increase in mutation rate when in non-genic regions, and accessible chromatin associates with a lower mutation rate in non-genic regions but a higher mutation rate in genic regions. Together, these results characterize key differences and similarities in the genomic distribution of somatic and germline mutations. 
  2. New genes arise through a variety of mechanisms, including the duplication of existing genes and the de novo birth of genes from noncoding DNA sequences. While there are numerous examples of duplicated genes with important functional roles, the functions of de novo genes remain largely unexplored. Many newly evolved genes are expressed in the male reproductive tract, suggesting that these evolutionary innovations may provide advantages to males experiencing sexual selection. Using testis-specific RNA interference, we screened 11 putative de novo genes in Drosophila melanogaster for effects on male fertility and identified two, goddard and saturn, that are essential for spermatogenesis and sperm function. Goddard knockdown (KD) males fail to produce mature sperm, while saturn KD males produce few sperm, and these function inefficiently once transferred to females. Consistent with a de novo origin, both genes are identifiable only in Drosophila and are predicted to encode proteins with no sequence similarity to any annotated protein. However, since high levels of divergence prevented the unambiguous identification of the noncoding sequences from which each gene arose, we consider goddard and saturn to be putative de novo genes. Within Drosophila, both genes have been lost in certain lineages, but show conserved, male-specific patterns of expression in the species in which they are found. Goddard is consistently found in single-copy and evolves under purifying selection. In contrast, saturn has diversified through gene duplication and positive selection. These data suggest that de novo genes can acquire essential roles in male reproduction.
  3. Experimental evolution is an approach that allows researchers to study organisms as they evolve in controlled environments. Despite the growing popularity of this approach, there are conceptual gaps among projects that use different experimental designs. One such gap concerns the contributions to adaptation of genetic variation present at the start of an experiment and that of new mutations that arise during an experiment. The primary source of genetic variation has historically depended largely on the study organisms. In the long-term evolution experiment (LTEE) using Escherichia coli, for example, each population started from a single haploid cell, and therefore, adaptation depended entirely on new mutations. Most other microbial evolution experiments have followed the same strategy. By contrast, evolution experiments using multicellular, sexually reproducing organisms typically start with preexisting variation that fuels the response to selection. New mutations may also come into play in later generations of these experiments, but it is generally difficult to quantify their contribution in these studies. Here, we performed an experiment using E. coli to compare the contributions of initial genetic variation and new mutations to adaptation in a new environment. Our experiment had four treatments that varied in their starting diversity, with 18 populations in each treatment. One treatment depended entirely on new mutations, while the other three began with mixtures of clones, whole-population samples, or mixtures of whole-population samples from the LTEE. We tracked a genetic marker associated with different founders in two treatments. These data revealed significant variation in fitness among the founders, and that variation impacted evolution in the early generations of our experiment. However, there were no differences in fitness among the treatments after 500 or 2,000 generations in the new environment, despite the variation in fitness among the founders. These results indicate that new mutations quickly dominated, and eventually they contributed more to adaptation than did the initial variation. Our study thus shows that preexisting genetic variation can have a strong impact on early evolution in a new environment, but new beneficial mutations may contribute more to later evolution and can even drive some initially beneficial variants to extinction.
  4. Malik, Harmit S. (Ed.)
    Comparative genomics has enabled the identification of genes that potentially evolved de novo from non-coding sequences. Many such genes are expressed in male reproductive tissues, but their functions remain poorly understood. To address this, we conducted a functional genetic screen of over 40 putative de novo genes with testis-enriched expression in Drosophila melanogaster and identified one gene, atlas, required for male fertility. Detailed genetic and cytological analyses showed that atlas is required for proper chromatin condensation during the final stages of spermatogenesis. Atlas protein is expressed in spermatid nuclei and facilitates the transition from histone- to protamine-based chromatin packaging. Complementary evolutionary analyses revealed the complex evolutionary history of atlas. The protein-coding portion of the gene likely arose at the base of the Drosophila genus on the X chromosome but was unlikely to be essential, as it was then lost in several independent lineages. Within the last ~15 million years, however, the gene moved to an autosome, where it fused with a conserved non-coding RNA and evolved a non-redundant role in male fertility. Altogether, this study provides insight into the integration of novel genes into biological processes, the links between genomic innovation and functional evolution, and the genetic control of a fundamental developmental process, gametogenesis.
  5. Ma, Li-Jun (Ed.)
    By introducing novel capacities and functions, new genes and gene families may play a crucial role in ecological transitions. Mechanisms generating new gene families include de novo gene birth, horizontal gene transfer, and neofunctionalization following a duplication event. The ectomycorrhizal (ECM) symbiosis is a ubiquitous mutualism and the association has evolved repeatedly and independently many times among the fungi, but the evolutionary dynamics enabling its emergence remain elusive. We developed a phylogenetic workflow to first understand if gene families unique to ECM Amanita fungi and absent from closely related asymbiotic species are functionally relevant to the symbiosis, and then to systematically infer their origins. We identified 109 gene families unique to ECM Amanita species. Genes belonging to unique gene families are under strong purifying selection and are upregulated during symbiosis, compared with genes of conserved or orphan gene families. The origins of seven of the unique gene families are strongly supported as either de novo gene birth (two gene families), horizontal gene transfer (four), or gene duplication (one). An additional 34 families appear new because of their selective retention within symbiotic species. Among the 109 unique gene families, the most upregulated gene in symbiotic cultures encodes a 1-aminocyclopropane-1-carboxylate deaminase, an enzyme capable of downregulating the synthesis of the plant hormone ethylene, a common negative regulator of plant-microbial mutualisms.