Search for: All records

Creators/Authors contains: "Bowman, Gregory R."

  1. Abstract

    Protein-protein and protein-nucleic acid interactions are often considered difficult drug targets because the surfaces involved lack obvious druggable pockets. Cryptic pockets could present opportunities for targeting these interactions, but identifying and exploiting these pockets remains challenging. Here, we apply a general pipeline for identifying cryptic pockets to the interferon inhibitory domain (IID) of Ebola virus viral protein 35 (VP35). VP35 plays multiple essential roles in Ebola’s replication cycle but lacks pockets that present obvious utility for drug design. Using adaptive sampling simulations and machine learning algorithms, we predict VP35 harbors a cryptic pocket that is allosterically coupled to a key dsRNA-binding interface. Thiol labeling experiments corroborate the predicted pocket and mutating the predicted allosteric network supports our model of allostery. Finally, covalent modifications that mimic drug binding allosterically disrupt dsRNA binding that is essential for immune evasion. Based on these results, we expect this pipeline will be applicable to other proteins.
  2. Abstract

    Understanding the structural determinants of a protein’s biochemical properties, such as activity and stability, is a major challenge in biology and medicine. Comparing computer simulations of protein variants with different biochemical properties is an increasingly powerful means to drive progress. However, success often hinges on dimensionality reduction algorithms for simplifying the complex ensemble of structures each variant adopts. Unfortunately, common algorithms rely on potentially misleading assumptions about what structural features are important, such as emphasizing larger geometric changes over smaller ones. Here we present DiffNets, self-supervised autoencoders that avoid such assumptions, and automatically identify the relevant features, by requiring that the low-dimensional representations they learn are sufficient to predict the biochemical differences between protein variants. For example, DiffNets automatically identify subtle structural signatures that predict the relative stabilities of β-lactamase variants and duty ratios of myosin isoforms. DiffNets should also be applicable to understanding other perturbations, such as ligand binding.
  4. Equilibria, or fixed points, play an important role in dynamical systems across various domains, yet finding them can be computationally challenging. Here, we show how to efficiently compute all equilibrium points of discrete-valued, discrete-time systems on sparse networks. Using graph partitioning, we recursively decompose the original problem into a set of smaller, simpler problems that are easy to compute, and whose solutions combine to yield the full equilibrium set. This makes it possible to find the fixed points of systems on arbitrarily large networks meeting certain criteria. This approach can also be used without computing the full equilibrium set, which may grow very large in some cases. For example, one can use this method to check the existence and total number of equilibria, or to find equilibria that are optimal with respect to a given cost function. We demonstrate the potential capabilities of this approach with examples in two scientific domains: computing the number of fixed points in brain networks and finding the minimal energy conformations of lattice-based protein folding models. 
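To make the notion of an equilibrium concrete, here is a minimal sketch that finds the fixed points of a small discrete-valued, discrete-time system by exhaustive enumeration. It is not the graph-partitioning method the abstract describes; it is the naive baseline whose cost grows exponentially with network size, which is the problem that method is meant to avoid. The function names, the 3-node ring network, and the cost function (number of active nodes) are all hypothetical choices made for illustration.

```python
from itertools import product


def fixed_points(update, n_nodes, n_states=2):
    """Return every state x with update(x) == x, by brute-force enumeration.

    This checks all n_states ** n_nodes states, so it is only feasible
    for very small networks.
    """
    return [
        x
        for x in product(range(n_states), repeat=n_nodes)
        if tuple(update(x)) == x
    ]


def ring_copy(x):
    """Hypothetical update rule: each node copies its left neighbor on a ring."""
    return tuple(x[i - 1] for i in range(len(x)))


equilibria = fixed_points(ring_copy, n_nodes=3)
print(equilibria)  # [(0, 0, 0), (1, 1, 1)]

# Pick the equilibrium that minimizes an assumed cost: the number of active nodes.
print(min(equilibria, key=sum))  # (0, 0, 0)
```

On this toy ring a state is an equilibrium only when every node equals its neighbor, so exactly two fixed points exist regardless of ring length. Counting equilibria or selecting an optimal one, as the abstract mentions, then reduces to `len(equilibria)` or a `min` over a cost function.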
  5. Viruses such as the novel coronavirus SARS-CoV-2, which is wreaking havoc on the world, depend on interactions of their own proteins with those of human host cells. Relatively small changes in sequence, such as between SARS-CoV and SARS-CoV-2, can dramatically change clinical phenotypes of the virus, including transmission rates and severity of the disease. On the other hand, highly dissimilar virus families such as Coronaviridae, Ebola, and HIV have overlapping functions. In this work we aim to analyze the role of protein sequence in the binding of SARS-CoV-2 proteins to human proteins and compare it to that of the other viruses mentioned above. We build supervised machine learning models using Generalized Additive Models to predict interactions based on sequence features, and find that our models perform well, with an AUC-PR of 0.65 at a class skew of 1:10. Analysis of the novel predictions using an independent dataset showed statistically significant enrichment. We further map the importance of specific amino-acid sequence features in predicting binding and summarize which combinations of sequences from the virus and the host are correlated with an interaction. By analyzing the sequence-based embeddings of the interactomes from different viruses and clustering them together, we find some functionally similar proteins from different viruses. For example, vif protein from HIV-1, vp24 from Ebola, and orf3b from SARS-CoV all function as interferon antagonists. Furthermore, we can differentiate the functions of similar viruses; for example, orf3a’s interactions are more diverged than orf7b interactions when comparing SARS-CoV and SARS-CoV-2.
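The reported AUC-PR is easier to interpret against the class skew: with one positive per ten negatives, a classifier that ranks pairs at random is expected to score roughly the positive prevalence, 1/11 (about 0.09). The sketch below computes average precision, one common estimator of the area under the precision-recall curve, in plain Python. The abstract does not say which estimator the authors used, and the labels and scores here are made-up toy values, not data from the study.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision@k over the ranks k of the positives.

    labels are 0/1 ground-truth values; scores are model confidences.
    Items are ranked by descending score.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


# Toy data with a 1:10 class skew: one interacting pair, ten non-interacting.
labels = [1] + [0] * 10
prevalence = sum(labels) / len(labels)  # 1/11, the random-ranking reference level

# Positive ranked second from the top.
scores = [0.9] + [0.95] + [0.1 * i for i in range(9)]
print(round(prevalence, 3))                # 0.091
print(average_precision(labels, scores))   # 0.5
```

A score of 0.65 at this skew is therefore several times higher than the level expected from random ranking.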
  6. Abstract

    The SARS-CoV-2 nucleocapsid (N) protein is an abundant RNA-binding protein critical for viral genome packaging, yet the molecular details that underlie this process are poorly understood. Here we combine single-molecule spectroscopy with all-atom simulations to uncover the molecular details that contribute to N protein function. N protein contains three dynamic disordered regions that house putative transiently-helical binding motifs. The two folded domains interact minimally such that full-length N protein is a flexible and multivalent RNA-binding protein. N protein also undergoes liquid-liquid phase separation when mixed with RNA, and polymer theory predicts that the same multivalent interactions that drive phase separation also engender RNA compaction. We offer a simple symmetry-breaking model that provides a plausible route through which single-genome condensation preferentially occurs over phase separation, suggesting that phase separation offers a convenient macroscopic readout of a key nanoscopic interaction.