Title: Evolutionary algorithms simulating molecular evolution: a new field proposal
Abstract The genetic blueprint for the essential functions of life is encoded in DNA, which is translated into proteins, the engines driving most of our metabolic processes. Recent advancements in genome sequencing have unveiled a vast diversity of protein families, but compared with the massive search space of all possible amino acid sequences, the set of known functional families is minimal. One could say nature has a limited protein “vocabulary.” A major question for computational biologists, therefore, is whether this vocabulary can be expanded to include useful proteins that went extinct long ago or have never evolved (yet). By merging evolutionary algorithms, machine learning, and bioinformatics, we can develop highly customized “designer proteins.” We dub the new subfield of computational evolution, which employs evolutionary algorithms with DNA string representations, biologically accurate molecular evolution, and bioinformatics-informed fitness functions, Evolutionary Algorithms Simulating Molecular Evolution.
Award ID(s):
2335888
PAR ID:
10596365
Author(s) / Creator(s):
; ;
Publisher / Repository:
Briefings in Bioinformatics
Date Published:
Journal Name:
Briefings in Bioinformatics
Volume:
25
Issue:
5
ISSN:
1467-5463
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The role of various interactions in determining the pressure adaptation of the proteome in piezophilic organisms remains to be established. It is clear that the adaptation is not limited to one or two proteins, but reflects a more general evolution of the characteristics of the entire proteome, so-called cryptic evolution. Using the synergy between bioinformatics, computer simulations, and some experimental evidence, we probed the physico-chemical mechanisms of cryptic evolution of the proteome of psychrophilic strains of the model organism Colwellia to adapt to life at various pressures, from the surface of the Arctic ice to the depth of the Mariana Trench. From the bioinformatics analysis of proteomes of several strains of Colwellia, we have identified the modulation of interactions between charged residues as a possible driver of evolutionary adaptation to high hydrostatic pressure. The computational modeling suggests that these interactions have different roles in modulating the function-stability relationship for different protein families. For several classes of proteins, the modulation of interactions between charges evolved to lead to an increase in stability with pressure, while for others, just the opposite is observed. The latter trend appears to benefit enzyme activity by countering structural rigidification due to the high pressure.
  2. Abstract According to the Principle of Minimal Frustration, folded proteins can only have a minimal number of strong energetic conflicts in their native states. However, not all interactions are energetically optimized for folding but some remain in energetic conflict, i.e. they are highly frustrated. This remaining local energetic frustration has been shown to be statistically correlated with distinct functional aspects such as protein-protein interaction sites, allosterism and catalysis. Fuelled by the recent breakthroughs in efficient protein structure prediction that have made available good quality models for most proteins, we have developed a strategy to calculate local energetic frustration within large protein families and quantify its conservation over evolutionary time. Based on this evolutionary information we can identify how stability and functional constraints have appeared at the common ancestor of the family and have been maintained over the course of evolution. Here, we present FrustraEvo, a web server tool to calculate and quantify the conservation of local energetic frustration in protein families. 
  3. Abstract Background: Identifying the DNA-binding specificities of transcription factors (TF) is central to understanding gene networks that regulate growth and development. Such knowledge is lacking in oomycetes, a microbial eukaryotic lineage within the stramenopile group. Oomycetes include many important plant and animal pathogens such as the potato and tomato blight agent Phytophthora infestans, which is a tractable model for studying life-stage differentiation within the group. Results: Mining of the P. infestans genome identified 197 genes encoding proteins belonging to 22 TF families. Their chromosomal distribution was consistent with family expansions through unequal crossing-over, which were likely ancient since each family had similar sizes in most oomycetes. Most TFs exhibited dynamic changes in RNA levels through the P. infestans life cycle. The DNA-binding preferences of 123 proteins were assayed using protein-binding oligonucleotide microarrays, which succeeded with 73 proteins from 14 families. Binding sites predicted for representatives of the families were validated by electrophoretic mobility shift or chromatin immunoprecipitation assays. Consistent with the substantial evolutionary distance of oomycetes from traditional model organisms, only a subset of the DNA-binding preferences resembled those of human or plant orthologs. Phylogenetic analyses of the TF families within P. infestans often discriminated clades with canonical and novel DNA targets. Paralogs with similar binding preferences frequently had distinct patterns of expression suggestive of functional divergence. TFs were predicted to either drive life stage-specific expression or serve as general activators based on the representation of their binding sites within total or developmentally-regulated promoters. This projection was confirmed for one TF using synthetic and mutated promoters fused to reporter genes in vivo. Conclusions: We established a large dataset of binding specificities for P. infestans TFs, representing the first in the stramenopile group. This resource provides a basis for understanding transcriptional regulation by linking TFs with their targets, which should help delineate the molecular components of processes such as sporulation and host infection. Our work also yielded insight into TF evolution during the eukaryotic radiation, revealing both functional conservation and diversification across kingdoms.
  4. Abstract Protein stability is a major constraint on protein evolution. Molecular chaperones, also known as heat-shock proteins, can relax this constraint and promote protein evolution by diminishing the deleterious effect of mutations on protein stability and folding. This effect, however, has only been established for a few chaperones. Here, we use a comprehensive chaperone-protein interaction network to study the effect of all yeast chaperones on the evolution of their protein substrates, that is, their clients. In particular, we analyze how yeast chaperones affect the evolutionary rates of their clients at two very different evolutionary time scales. We first study the effect of chaperone-mediated folding on protein evolution over the evolutionary divergence of Saccharomyces cerevisiae and S. paradoxus. We then test whether yeast chaperones have left a similar signature on the patterns of standing genetic variation found in modern wild and domesticated strains of S. cerevisiae. We find that genes encoding chaperone clients have diverged faster than genes encoding nonclient proteins when controlling for their number of protein-protein interactions. We also find that genes encoding client proteins have accumulated more intra-specific genetic diversity than those encoding nonclient proteins. In a number of multivariate analyses, controlling for other well-known factors that affect protein evolution, we find that chaperone dependence explains the largest fraction of the observed variance in the rate of evolution at both evolutionary time scales. Chaperones affecting rates of protein evolution mostly belong to two major chaperone families: Hsp70s and Hsp90s. Our analyses show that protein chaperones, by virtue of their ability to buffer destabilizing mutations and their role in modulating protein genotype-phenotype maps, have a considerable accelerating effect on protein evolution.
  5. Abstract The broadly conserved ParB protein performs crucial functions in bacterial chromosome segregation and replication regulation. The cellular function of ParB requires it to dimerize, recognize parS DNA sequences, clamp on DNA, then slide to adjacent sequences through nonspecific DNA binding. How ParB coordinates nonspecific DNA binding and sliding remains elusive. Here, we combine multiple in vitro biophysical and computational tools and in vivo approaches to address this question. We found that the five conserved lysine residues in the C-terminal domain of ParB play distinct roles in proper positioning and sliding on DNA, and their integrity is crucial for ParB’s in vivo functions. Many proteins with diverse cellular activities need to move on DNA while loosely bound. Our findings reveal the detailed molecular mechanism by which multiple flexible basic residues enable DNA binding proteins to efficiently slide along DNA.