

Title: Evolution of networks of protein domain organization
Abstract

Domains are the structural, functional and evolutionary units of proteins. They combine to form multidomain proteins. The evolutionary history of this molecular combinatorics has been studied with phylogenomic methods. Here, we construct networks of domain organization and explore their evolution. A time series of networks revealed two ancient waves of structural novelty arising from ancient ‘p-loop’ and ‘winged helix’ domains and a massive ‘big bang’ of domain organization. The evolutionary recruitment of domains was highly modular, hierarchical and ongoing. Domain rearrangements elicited non-random and scale-free network structure. Comparative analyses of preferential attachment, randomness and modularity showed yin-and-yang complementary transition and biphasic patterns along the structural chronology. Remarkably, the evolving networks highlighted a central evolutionary role of cofactor-supporting structures of non-ribosomal peptide synthesis pathways, likely crucial to the early development of the genetic code. Some highly modular domains featured dual response regulation in two-component signal transduction systems with DNA-binding activity linked to transcriptional regulation of responses to environmental change. Interestingly, hub domains across the evolving networks shared the historical role of DNA binding and editing, an ancient protein function in molecular evolution. Our investigation unfolds historical source-sink patterns of evolutionary recruitment that further our understanding of protein architectures and functions.

 
NSF-PAR ID:
10246676
Author(s) / Creator(s):
;
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Scientific Reports
Volume:
11
Issue:
1
ISSN:
2045-2322
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    The structures and functions of proteins are embedded into the loop scaffolds of structural domains. Their origin and evolution remain mysterious. Here, we use a novel graph-theoretical approach to describe how modular and non-modular loop prototypes combine to form folded structures in protein domain evolution. Phylogenomic data-driven chronologies reoriented a bipartite network of loops and domains (and its projections) into ‘waterfalls’ depicting an evolving ‘elementary functionome’ (EF). Two primordial waves of functional innovation involving founder ‘p-loop’ and ‘winged-helix’ domains were accompanied by an ongoing emergence and reuse of structural and functional novelty. Metabolic pathways expanded before translation functionalities. A dual hourglass recruitment pattern transferred scale-free properties from loop to domain components of the EF network in generative cycles of hierarchical modularity. Modeling the evolutionary emergence of the oldest P-loop and winged-helix domains with AlphaFold2 uncovered rapid convergence towards folded structure, suggesting that a folding vocabulary exists in loops for protein fold repurposing and design.

     
  2. Abstract

    Immunoglobulin Binding Protein (BiP) is a chaperone and molecular motor belonging to the Hsp70 family, involved in the regulation of important biological processes such as synthesis, folding and translocation of proteins in the Endoplasmic Reticulum. BiP has two highly conserved domains: the N‐terminal Nucleotide‐Binding Domain (NBD), and the C‐terminal Substrate‐Binding Domain (SBD), connected by a hydrophobic linker. ATP binds and is hydrolyzed to ADP in the NBD, and BiP's extended polypeptide substrates bind in the SBD. Like many molecular motors, BiP function depends on both structural and catalytic properties that may contribute to its performance. One novel approach to studying the mechanical properties of BiP explores the changes in viscoelastic behavior upon ligand binding, using a technique called nano‐rheology. This technique is essentially a traditional rheology experiment, in which an oscillatory force is directly applied to the protein under study, and the resulting average deformation is measured. Our results show that the folded state of the protein behaves like a viscoelastic material, getting softer when it binds nucleotides (ATP, ADP, and AMP‐PNP), but stiffer when binding HTFPAVL peptide substrate. Also, we observed that peptide binding dramatically increases the affinity for ADP, decreasing its dissociation constant (KD) around 1000 times, demonstrating allosteric coupling between the SBD and NBD domains.

     
  3. Larracuente, Amanda (Ed.)
    Abstract

    The methyltransferase-like (METTL) proteins constitute a family of seven-beta-strand methyltransferases with S-adenosyl methionine binding domains that modify DNA, RNA, and proteins. Methylation by METTL proteins contributes to the epigenetic, and in the case of RNA modifications, epitranscriptomic regulation of a variety of biological processes. Despite their functional importance, most investigations of the substrates and functions of METTLs within metazoans have been restricted to model vertebrate taxa. In the present work, we explore the evolutionary mechanisms driving the diversification and functional differentiation of 33 individual METTL proteins across Metazoa. Our results show that METTLs are nearly ubiquitous across the animal kingdom, with most having arisen early in metazoan evolution (i.e., they occur in basal metazoan phyla). Individual METTL lineages each originated from single independent ancestors, constituting monophyletic clades, which suggests that each METTL was subject to strong selective constraints driving its structural and/or functional specialization. Interestingly, a similar process did not extend to the differentiation of nucleoside-modifying and protein-modifying METTLs (i.e., each METTL type did not form a unique monophyletic clade). The members of these two types of METTLs also exhibited differences in their rates of evolution. Overall, we provide evidence that the long-term evolution of METTL family members was driven by strong purifying selection, which, in combination with adaptive selection episodes, led to the functional specialization of individual METTL lineages. This work contributes useful information regarding the evolution of a gene family that fulfills a variety of epigenetic functions and can have profound influences on molecular processes and phenotypic traits.
  4. Abstract

    Structure and functions of S100 proteins are regulated by two distinct calcium binding EF hand motifs. In this work, we used solution‐state NMR spectroscopy to investigate the cooperativity between the two calcium binding sites and map the allosteric changes at the target binding site. To parse the contribution of the individual calcium binding events, variants of S100A12 were designed to selectively bind calcium to either the EF‐I (N63A) or EF‐II (E31A) loop, respectively. Detailed analysis of the backbone chemical shifts for wildtype protein and its mutants indicates that calcium binding to the canonical EF‐II loop is the principal trigger for the conformational switch from the ‘closed’ apo to the ‘open’ Ca2+‐bound conformation of the protein. Elimination of binding in the S100‐specific EF‐I loop has limited impact on the calcium binding affinity of the EF‐II loop and the concomitant structural rearrangement. In contrast, deletion of binding in the EF‐II loop significantly attenuates calcium affinity in the EF‐I loop and the structure adopts a ‘closed’ apo‐like conformation. Analysis of experimental amide nitrogen (15N) relaxation rates (R1, R2, and 15N–{1H} NOE) and molecular dynamics (MD) simulations demonstrates that the calcium bound state is relatively floppy with pico–nanosecond motions induced in functionally relevant domains responsible for target recognition such as the hinge domain and the C‐terminal residues. Experimental relaxation studies combined with MD simulations show that while calcium binding in the EF‐I loop alone does not induce significant motions in the polypeptide chain, EF‐I regulates fluctuations in the polypeptide in the presence of bound calcium in the EF‐II loop. These results offer novel insights into the dynamic regulation of target recognition by calcium binding and unravel the role of cooperativity between the two calcium binding events in S100A12.

     
  5. Abstract

    Identification of the molecular networks that facilitated the evolution of multicellular animals from their unicellular ancestors is a fundamental problem in evolutionary cellular biology. Choanoflagellates are recognized as the closest extant nonmetazoan ancestors to animals. These unicellular eukaryotes can adopt a multicellular‐like “rosette” state. Therefore, they are compelling models for the study of early multicellularity. Comparative studies revealed that a number of putative human orthologs are present in choanoflagellate genomes, suggesting that a subset of these genes were necessary for the emergence of multicellularity. However, previous work is largely based on sequence alignments alone, which do not confirm structural or functional similarity. Here, we focus on the PDZ domain, a peptide‐binding domain which plays critical roles in myriad cellular signaling networks and which underwent a gene family expansion in metazoan lineages. Using a customized sequence similarity search algorithm, we identified 178 PDZ domains in the Monosiga brevicollis proteome. This includes 11 previously unidentified sequences, which we analyzed using Rosetta and homology modeling. To assess conservation of protein structure, we solved high‐resolution crystal structures of representative M. brevicollis PDZ domains that are homologous to human Dlg1 PDZ2, Dlg1 PDZ3, GIPC, and SHANK1 PDZ domains. To assess functional conservation, we calculated binding affinities for mbGIPC, mbSHANK1, mbSNX27, and mbDLG‐3 PDZ domains from M. brevicollis. We find that peptide selectivity is generally conserved between these two disparate organisms, with one possible exception, mbDLG‐3. Overall, our results provide novel insight into signaling pathways in a choanoflagellate model of primitive multicellularity.

     