skip to main content


Title: A putative de novo evolved gene required for spermatid chromatin condensation in Drosophila melanogaster
Comparative genomics has enabled the identification of genes that potentially evolved de novo from non-coding sequences. Many such genes are expressed in male reproductive tissues, but their functions remain poorly understood. To address this, we conducted a functional genetic screen of over 40 putative de novo genes with testis-enriched expression in Drosophila melanogaster and identified one gene, atlas , required for male fertility. Detailed genetic and cytological analyses showed that atlas is required for proper chromatin condensation during the final stages of spermatogenesis. Atlas protein is expressed in spermatid nuclei and facilitates the transition from histone- to protamine-based chromatin packaging. Complementary evolutionary analyses revealed the complex evolutionary history of atlas . The protein-coding portion of the gene likely arose at the base of the Drosophila genus on the X chromosome but was unlikely to be essential, as it was then lost in several independent lineages. Within the last ~15 million years, however, the gene moved to an autosome, where it fused with a conserved non-coding RNA and evolved a non-redundant role in male fertility. Altogether, this study provides insight into the integration of novel genes into biological processes, the links between genomic innovation and functional evolution, and the genetic control of a fundamental developmental process, gametogenesis.  more » « less
Award ID(s):
1652013
NSF-PAR ID:
10298481
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ;
Editor(s):
Malik, Harmit S.
Date Published:
Journal Name:
PLOS Genetics
Volume:
17
Issue:
9
ISSN:
1553-7404
Page Range / eLocation ID:
e1009787
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Abstract Comparative genomic studies have repeatedly shown that new protein-coding genes can emerge de novo from noncoding DNA. Still unknown is how and when the structures of encoded de novo proteins emerge and evolve. Combining biochemical, genetic and evolutionary analyses, we elucidate the function and structure of goddard , a gene which appears to have evolved de novo at least 50 million years ago within the Drosophila genus. Previous studies found that goddard is required for male fertility. Here, we show that Goddard protein localizes to elongating sperm axonemes and that in its absence, elongated spermatids fail to undergo individualization. Combining modelling, NMR and circular dichroism (CD) data, we show that Goddard protein contains a large central α -helix, but is otherwise partially disordered. We find similar results for Goddard’s orthologs from divergent fly species and their reconstructed ancestral sequences. Accordingly, Goddard’s structure appears to have been maintained with only minor changes over millions of years. 
    more » « less
  2. New genes arise through a variety of mechanisms, including the duplication of existing genes and the de novo birth of genes from noncoding DNA sequences. While there are numerous examples of duplicated genes with important func- tional roles, the functions of de novo genes remain largely unexplored. Many newly evolved genes are expressed in the male reproductive tract, suggesting that these evolutionary innovations may provide advantages to males experiencing sexual selection. Using testis-specific RNA interference, we screened 11 putative de novo genes in Drosophila mela- nogaster for effects on male fertility and identified two, goddard and saturn, that are essential for spermatogenesis and sperm function. Goddard knockdown (KD) males fail to produce mature sperm, while saturn KD males produce few sperm, and these function inefficiently once transferred to females. Consistent with a de novo origin, both genes are identifiable only in Drosophila and are predicted to encode proteins with no sequence similarity to any annotated protein. However, since high levels of divergence prevented the unambiguous identification of the noncoding sequences from which each gene arose, we consider goddard and saturn to be putative de novo genes. Within Drosophila, both genes have been lost in certain lineages, but show conserved, male-specific patterns of expression in the species in which they are found. Goddard is consistently found in single-copy and evolves under purifying selection. In contrast, saturn has diversified through gene duplication and positive selection. These data suggest that de novo genes can acquire essential roles in male reproduction. 
    more » « less
  3. Félix, M -A (Ed.)
    Abstract Plectus murrayi is one of the most common and locally abundant invertebrates of continental Antarctic ecosystems. Because it is readily cultured on artificial medium in the laboratory and highly tolerant to an extremely harsh environment, P. murrayi is emerging as a model organism for understanding the evolutionary origin and maintenance of adaptive responses to multiple environmental stressors, including freezing and desiccation. The de novo assembled genome of P. murrayi contains 225.741 million base pairs and a total of 14,689 predicted genes. Compared to Caenorhabditis elegans, the architectural components of P. murrayi are characterized by a lower number of protein-coding genes, fewer transposable elements, but more exons, than closely related taxa from less harsh environments. We compared the transcriptomes of lab-reared P. murrayi with wild-caught P. murrayi and found genes involved in growth and cellular processing were up-regulated in lab-cultured P. murrayi, while a few genes associated with cellular metabolism and freeze tolerance were expressed at relatively lower levels. Preliminary comparative genomic and transcriptomic analyses suggest that the observed constraints on P. murrayi genome architecture and functional gene expression, including genome decay and intron retention, may be an adaptive response to persisting in a biotically simplified, yet consistently physically harsh environment. 
    more » « less
  4. Abstract Motivation

    Genetic variation that disrupts gene function by altering gene splicing between individuals can substantially influence traits and disease. In those cases, accurately predicting the effects of genetic variation on splicing can be highly valuable for investigating the mechanisms underlying those traits and diseases. While methods have been developed to generate high quality computational predictions of gene structures in reference genomes, the same methods perform poorly when used to predict the potentially deleterious effects of genetic changes that alter gene splicing between individuals. Underlying that discrepancy in predictive ability are the common assumptions by reference gene finding algorithms that genes are conserved, well-formed and produce functional proteins.

    Results

    We describe a probabilistic approach for predicting recent changes to gene structure that may or may not conserve function. The model is applicable to both coding and non-coding genes, and can be trained on existing gene annotations without requiring curated examples of aberrant splicing. We apply this model to the problem of predicting altered splicing patterns in the genomes of individual humans, and we demonstrate that performing gene-structure prediction without relying on conserved coding features is feasible. The model predicts an unexpected abundance of variants that create de novo splice sites, an observation supported by both simulations and empirical data from RNA-seq experiments. While these de novo splice variants are commonly misinterpreted by other tools as coding or non-coding variants of little or no effect, we find that in some cases they can have large effects on splicing activity and protein products and we propose that they may commonly act as cryptic factors in disease.

    Availability and implementation

    The software is available from geneprediction.org/SGRF.

    Supplementary information

    Supplementary information is available at Bioinformatics online.

     
    more » « less
  5. Abstract

    Epigenetic variations contribute greatly to the phenotypic plasticity and diversity. Current functional studies on epialleles have predominantly focused on protein-coding genes, leaving the epialleles of non-coding RNA (ncRNA) genes largely understudied. Here, we uncover abundant DNA methylation variations of ncRNA genes and their significant correlations with plant adaptation among 1001 naturalArabidopsisaccessions. Through genome-wide association study (GWAS), we identify large numbers of methylation QTL (methylQTL) that are independent of known DNA methyltransferases and enriched in specific chromatin states. Proximal methylQTL closely located to ncRNA genes have a larger effect on DNA methylation than distal methylQTL. We ectopically tether a DNA methyltransferase MQ1v to miR157a by CRISPR-dCas9 and show de novo establishment of DNA methylation accompanied with decreased miR157a abundance and early flowering. These findings provide important insights into the genetic basis of epigenetic variations and highlight the contribution of epigenetic variations of ncRNA genes to plant phenotypes and diversity.

     
    more » « less