Title: Large-scale discovery of protein interactions at residue resolution using co-evolution calculated from genomic sequences
Abstract

Increasing numbers of protein interactions have been identified in high-throughput experiments, but only a small proportion have solved structures. Recently, sequence coevolution-based approaches have led to a breakthrough in predicting monomer protein structures and protein interaction interfaces. Here, we address the challenges of large-scale interaction prediction at residue resolution with a fast alignment concatenation method and a probabilistic score for the interaction of residues. Importantly, this method (EVcomplex2) is able to assess the likelihood of a protein interaction, as we show by applying it to large-scale experimental datasets in which the pairwise interactions are unknown. We predict 504 interactions de novo in the E. coli membrane proteome, including 243 that are newly discovered. While EVcomplex2 does not require available structures, coevolving residue pairs can be used to produce structural models of protein interactions, as done here for membrane complexes including the Flagellar Hook-Filament Junction and the Tol/Pal complex.

 
NSF-PAR ID:
10216081
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Nature Communications
Volume:
12
Issue:
1
ISSN:
2041-1723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Plants capture and convert solar energy in a complex network of membrane proteins. Under high light, the luminal pH drops and induces a reorganization of the protein network, particularly clustering of the major light-harvesting complex (LHCII). While the structures of the network have been resolved in exquisite detail, the thermodynamics that control the assembly and reorganization had not been determined, largely because the interaction energies of membrane proteins have been inaccessible. Here, we describe a method to quantify these energies and its application to LHCII. Using single-molecule measurements, LHCII proteoliposomes, and statistical thermodynamic modeling, we quantified the LHCII-LHCII interaction energy as ~−5 kBT at neutral pH and at least −7 kBT at acidic pH. These values revealed an enthalpic thermodynamic driving force behind LHCII clustering. Collectively, this work captures the interactions that drive the organization of membrane protein networks from the perspective of equilibrium statistical thermodynamics, which has a long and rich tradition in biology.

     
  2. Kolodny, Rachel (Ed.)
    Crystallography and NMR system (CNS) is currently a widely used method for fragment-free ab initio protein folding from inter-residue distance or contact maps. Despite its widespread use in protein structure prediction, CNS is a decade-old macromolecular structure determination system that was originally developed for solving macromolecular geometry from experimental restraints as opposed to predictive modeling driven by interaction map data. As such, the adaptation of the CNS experimental structure determination protocol for ab initio protein folding is intrinsically anomalous and may undermine the folding accuracy of computational protein structure prediction. In this paper, we propose a new CNS-free hierarchical structure modeling method called DConStruct for folding both soluble and membrane proteins driven by distance and contact information. Rigorous experimental validation shows that DConStruct attains much better reconstruction accuracy than CNS when tested with the same input contact map at varying contact thresholds. The hierarchical modeling with iterative self-correction employed in DConStruct scales at a much higher degree of folding accuracy than CNS with the increase in contact thresholds, ultimately approaching near-optimal reconstruction accuracy at higher-thresholded contact maps. The folding accuracy of DConStruct can be further improved by exploiting distance-based hybrid interaction maps at tri-level thresholding, as demonstrated by the better performance of our method in folding free modeling targets from the 12th and 13th rounds of the Critical Assessment of techniques for protein Structure Prediction (CASP) experiments compared to popular CNS- and fragment-based approaches and energy-minimization protocols, some of which even use much finer-grained distance maps than ours. Additional large-scale benchmarking shows that DConStruct can significantly improve the folding accuracy of membrane proteins compared to a CNS-based approach. These results collectively demonstrate the feasibility of greatly improving the accuracy of ab initio protein folding by optimally exploiting the information encoded in inter-residue interaction maps beyond what is possible by CNS.
  3. Abstract

    Cristae are high‐curvature structures in the inner mitochondrial membrane (IMM) that are crucial for ATP production. While cristae‐shaping proteins have been defined, analogous lipid‐based mechanisms have yet to be elucidated. Here, we combine experimental lipidome dissection with multi‐scale modeling to investigate how lipid interactions dictate IMM morphology and ATP generation. When modulating phospholipid (PL) saturation in engineered yeast strains, we observed a surprisingly abrupt breakpoint in IMM topology driven by a continuous loss of ATP synthase organization at cristae ridges. We found that cardiolipin (CL) specifically buffers the inner mitochondrial membrane against curvature loss, an effect that is independent of ATP synthase dimerization. To explain this interaction, we developed a continuum model for cristae tubule formation that integrates both lipid and protein‐mediated curvatures. This model highlighted a snapthrough instability, which drives IMM collapse upon small changes in membrane properties. We also showed that cardiolipin is essential in low‐oxygen conditions that promote PL saturation. These results demonstrate that the mechanical function of cardiolipin is dependent on the surrounding lipid and protein components of the IMM.

     
  4. Abstract

    Bacterial extracellular vesicles (BEVs), including outer membrane vesicles, have emerged as a promising new class of vaccines and therapeutics to treat cancer and inflammatory diseases, among other applications. However, clinical translation of BEVs is hindered by a current lack of scalable and efficient purification methods. Here, we address downstream BEV biomanufacturing limitations by developing a method for orthogonal size‐ and charge‐based BEV enrichment using tangential flow filtration (TFF) in tandem with high performance anion exchange chromatography (HPAEC). The data show that size‐based separation coisolated protein contaminants, whereas size‐based TFF with charge‐based HPAEC dramatically improved purity of BEVs produced by probiotic Gram‐negative Escherichia coli and Gram‐positive lactic acid bacteria (LAB). Escherichia coli BEV purity was quantified using established biochemical markers while improved LAB BEV purity was assessed via observed potentiation of anti‐inflammatory bioactivity. Overall, this work establishes orthogonal TFF + HPAEC as a scalable and efficient method for BEV purification that holds promise for future large‐scale biomanufacturing of therapeutic BEV products.

     
  5.
    The relation between amino acid (AA) sequence and biologically active conformation controls the process of polypeptide chains folding into three-dimensional (3d) protein structures. Recent advances in the resolution achieved by cryo-electron microscopy, coupled with improvements in computational methodologies, have accelerated the analysis of structures and properties of proteins. However, the detailed interaction between AAs has not been fully elucidated. Herein, we present a de novo method to evaluate inter-amino acid interactions based on the concept of accurately evaluating the amino acid bond pairs (AABP). The results obtained enabled the identification of a complex, long-range, interconnected 3d network of interacting AAs in proteins. The method is applied to the receptor binding domain (RBD) of the SARS-CoV-2 spike protein. We show that although nearest-neighbor AAs in the primary sequence have large AABP, other nonlocal AAs make substantial contributions to AABP with significant participation of both covalent and hydrogen bonding. Detailed analysis of AABP in RBD reveals the pivotal role they play in sequence conservation with profound implications on residue mutations and for therapeutic drug design. This approach could be easily applied to many other proteins of biomedical interest in life sciences.