

Title: Could network structures generated with simple rules imposed on a cubic lattice reproduce the structural descriptors of globular proteins?
Abstract: A direct way to spot structural features that are universally shared among proteins is to find analogues in simpler condensed matter systems. The current study investigates the feasibility of creating ensembles of artificial structures that automatically reproduce a large number of geometrical and topological descriptors of globular proteins. Towards this aim, a simple cubic (SC) arrangement is shown to provide the best background lattice after a careful analysis of the residue packing trends from 210 globular proteins. It is shown that a minimalistic set of rules imposed on this lattice is sufficient to generate structures that can mimic real proteins. In the proposed method, 210 such structures are generated by randomly removing residues (beads) from clusters that have an SC lattice arrangement, such that each generated structure forms a single connected component. Two additional sets are prepared from the initial structures via random relaxation and a reverse Monte Carlo simulated annealing algorithm, which targets the average radial distribution function (RDF) of the 210 globular proteins. The initial and relaxed structures are compared to real proteins via the RDF, bond orientational order parameters and several descriptors of network topology. Based on these features, the results indicate that the structures generated with 40% occupancy closely resemble real residue networks. The structure generation mechanism automatically produces networks that are in the same topological class as globular proteins and reproduce the small-world characteristics of high clustering and small shortest path lengths. Most notably, the established correspondence rules out icosahedral order as a relevant structural feature for residue networks, in contrast to other amorphous systems where it is an inherent characteristic.
The close correspondence is also observed in the vibrational characteristics as computed from the Anisotropic Network Model, hinting at a non-superficial link between proteins and defect-laden cubic crystalline order.
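The generation step described in the abstract (random removal of beads from an SC cluster while keeping a single connected component) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The cube-shaped starting cluster, the use of the six nearest lattice neighbours to define connectivity, and the names `neighbors`, `is_connected` and `generate_structure` are all assumptions made for illustration; the relaxation and reverse Monte Carlo steps are not covered.

```python
import random
from collections import deque


def neighbors(site):
    """Six nearest neighbours of a site on a simple cubic lattice (assumed contact rule)."""
    x, y, z = site
    return [(x + 1, y, z), (x - 1, y, z),
            (x, y + 1, z), (x, y - 1, z),
            (x, y, z + 1), (x, y, z - 1)]


def is_connected(sites):
    """Breadth-first search: True if the occupied sites form one connected component."""
    if not sites:
        return False
    start = next(iter(sites))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nb in neighbors(current):
            if nb in sites and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return len(seen) == len(sites)


def generate_structure(side=6, occupancy=0.4, seed=0):
    """Randomly remove beads from a side^3 cube (hypothetical cluster shape)
    until the target occupancy is reached, rejecting any removal that would
    split the structure into more than one component."""
    rng = random.Random(seed)
    sites = {(x, y, z)
             for x in range(side) for y in range(side) for z in range(side)}
    target = round(occupancy * side ** 3)
    while len(sites) > target:
        candidates = sorted(sites)   # sorted so a given seed is reproducible
        rng.shuffle(candidates)
        for site in candidates:
            sites.remove(site)
            if is_connected(sites):
                break                # removal accepted
            sites.add(site)          # removal would disconnect: undo it
        else:
            break                    # no bead can be removed safely
    return sites


if __name__ == "__main__":
    beads = generate_structure(side=6, occupancy=0.4, seed=42)
    print(len(beads), "beads kept; connected:", is_connected(beads))
```

The connectivity check is rerun after every trial removal, which is simple but costs one full traversal per trial; this is adequate for clusters of a few hundred beads, the scale relevant to single-domain proteins.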
Award ID(s):
1825254
NSF-PAR ID:
10312846
Editor(s):
Estrada, Ernesto
Journal Name:
Journal of Complex Networks
Volume:
10
Issue:
1
ISSN:
2051-1310
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    The recent breakthroughs in structure prediction, where methods such as AlphaFold demonstrated near‐atomic accuracy, herald a paradigm shift in structural biology. The 200 million high‐accuracy models released in the AlphaFold Database are expected to guide protein science in the coming decades. Partitioning these AlphaFold models into domains and assigning them to an evolutionary hierarchy provide an efficient way to gain functional insights into proteins. However, classifying such a large number of predicted structures challenges the infrastructure of current structure classifications, including our Evolutionary Classification of protein Domains (ECOD). Better computational tools are urgently needed to parse and classify domains from AlphaFold models automatically. Here we present a Domain Parser for AlphaFold Models (DPAM) that can automatically recognize globular domains from these models based on inter‐residue distances in 3D structures, predicted aligned errors, and ECOD domains found by sequence (HHsuite) and structural (Dali) similarity searches. Based on a benchmark of 18,759 AlphaFold models, we demonstrate that DPAM can recognize 98.8% of domains and assign correct boundaries for 87.5%, significantly outperforming structure‐based domain parsers and homology‐based domain assignment using ECOD domains found by HHsuite or Dali. Application of DPAM to the massive AlphaFold models will enable efficient classification of domains, providing evolutionary contexts and facilitating functional studies.

     
  2. Abstract

    Cellular studies indicate that endocannabinoid type‐1 retrograde signaling plays a major role in synaptic plasticity. Disruption of these processes by delta‐9‐tetrahydrocannabinol (THC) could produce alterations either in structural and functional brain connectivity or in their association in cannabis (CB) users. Graph theoretic structural and functional networks were generated with diffusion tensor imaging and resting‐state functional imaging in 37 current CB users and 31 healthy non‐users. The primary outcome measures were coupling between structural and functional connectivity, global network characteristics, association between the coupling and network properties, and measures of rich‐club organization. Structural–functional (SC–FC) coupling was globally preserved showing a positive association in current CB users. However, the users had disrupted associations between SC–FC coupling and network topological characteristics, most perturbed for shorter connections implying region‐specific disruption by CB use. Rich‐club analysis revealed impaired SC–FC coupling in the hippocampus and caudate of users. This study provides evidence of the abnormal SC–FC association in CB users. The effect was predominant in shorter connections of the brain network, suggesting that the impact of CB use or predispositional factors may be most apparent in local interconnections. Notably, the hippocampus and caudate specifically showed aberrant structural and functional coupling. These structures have high CB1 receptor density and may also be associated with changes in learning and habit formation that occur with chronic cannabis use.

     
  3. Abstract Although first principles based anharmonic lattice dynamics is one of the most common methods to obtain phonon properties, such a method is impractical for high-throughput search of target thermal materials. We develop an elemental spatial density neural network force field as a bottom-up approach to accurately predict atomic forces of ~80,000 cubic crystals spanning 63 elements. The primary advantage of our indirect machine learning model is the accessibility of phonon transport physics at the same level as first principles, allowing simultaneous prediction of comprehensive phonon properties from a single model. Training on 3182 first principles data and screening 77,091 unexplored structures, we identify 13,461 dynamically stable cubic structures with ultralow lattice thermal conductivity below 1 W m⁻¹ K⁻¹, among which 36 structures are validated by first principles calculations. We propose mean square displacement and bonding-antibonding as two low-cost descriptors to ease the demand of expensive first principles calculations for fast screening of ultralow thermal conductivity. Our model also quantitatively reveals the correlation between off-diagonal coherence and diagonal populations and identifies the distinct crossover from particle-like to wave-like heat conduction. Our algorithm is promising for accelerating discovery of novel phononic crystals for emerging applications, such as thermoelectrics, superconductivity, and topological phonons for quantum information technology.
  4. Cryo-electron microscopy (cryo-EM) is a structural technique that has played a significant role in protein structure determination in recent years. Compared to the traditional methods of X-ray crystallography and NMR spectroscopy, cryo-EM is capable of producing images of much larger protein complexes. However, cryo-EM reconstructions are in some cases limited to medium resolution (~4–10 Å). In this resolution range, a cryo-EM density map can hardly be used to directly determine the structure of proteins at atomic-level resolution, or even at the level of their amino acid residue backbones. At such a resolution, only the position and orientation of secondary structure elements (SSEs) such as α-helices and β-sheets are observable. Consequently, finding the mapping of the secondary structures of the modeled structure (SSEs-A) to the cryo-EM map (SSEs-C) is one of the primary concerns in cryo-EM modeling. To address this issue, this study proposes a novel automatic computational method to identify SSEs correspondence in three-dimensional (3D) space. Initially, through a modeling of the target sequence with the aid of extracting highly reliable features from a generated 3D model and map, the SSEs matching problem is formulated as a 3D vector matching problem. Afterward, the 3D vector matching problem is transformed into a 3D graph matching problem. Finally, a similarity-based voting algorithm combined with the principle of least conflict (PLC) concept is developed to obtain the SSEs correspondence. To evaluate the accuracy of the method, a testing set of 25 experimental and simulated maps with a maximum of 65 SSEs is selected. Comparative studies are also conducted to demonstrate the superiority of the proposed method over some state-of-the-art techniques. The results demonstrate that the method is efficient, robust, and works well in the presence of errors in the predicted secondary structures of the cryo-EM images.
  5. Abstract

    Structures of proteins and protein–protein complexes are determined by the same physical principles and thus share a number of similarities. At the same time, there could be differences because in order to function, proteins interact with other molecules, undergo conformational changes, and so forth, which might impose different restraints on the tertiary versus quaternary structures. This study focuses on structural properties of protein–protein interfaces in comparison with the protein core, based on the wealth of currently available structural data and new structure‐based approaches. The results showed that physicochemical characteristics, such as amino acid composition, residue–residue contact preferences, and hydrophilicity/hydrophobicity distributions, are similar in the protein core and protein–protein interfaces. On the other hand, characteristics that reflect the evolutionary pressure, such as structural composition and packing, are largely different. The results provide important insight into fundamental properties of protein structure and function. At the same time, the results contribute to a better understanding of the ways to dock proteins. Recent progress in predicting structures of individual proteins follows the advancement of deep learning techniques and new approaches to residue coevolution data. The protein core could potentially provide large amounts of data for application of deep learning to docking. However, our results showed that the core motifs are significantly different from those at protein–protein interfaces, and thus may not be directly useful for docking. At the same time, such a difference may help to overcome a major obstacle in the application of coevolutionary data to docking: discrimination of the intramolecular information not directly relevant to docking.

     