Title: Could network structures generated with simple rules imposed on a cubic lattice reproduce the structural descriptors of globular proteins?
Abstract A direct way to spot structural features that are universally shared among proteins is to find analogues from simpler condensed matter systems. In the current study, the feasibility of creating ensembles of artificial structures that can automatically reproduce a large number of geometrical and topological descriptors of globular proteins is investigated. Towards this aim, a simple cubic (SC) arrangement is shown to provide the best background lattice after a careful analysis of the residue packing trends from 210 globular proteins. It is shown that a minimalistic set of rules imposed on this lattice is sufficient to generate structures that can mimic real proteins. In the proposed method, 210 such structures are generated by randomly removing residues (beads) from clusters that have an SC lattice arrangement, such that every generated structure forms a single connected component. Two additional sets are prepared from the initial structures via random relaxation and a reverse Monte Carlo simulated annealing algorithm, which targets the average radial distribution function (RDF) of 210 globular proteins. The initial and relaxed structures are compared to real proteins via RDF, bond orientational order parameters and several descriptors of network topology. Based on these features, results indicate that the structures generated with 40% occupancy closely resemble real residue networks. The structure generation mechanism automatically produces networks that are in the same topological class as globular proteins and reproduce the small-world characteristics of high clustering and short shortest path lengths. Most notably, the established correspondence rules out icosahedral order as a relevant structural feature for residue networks, in contrast to other amorphous systems where it is an inherent characteristic. The close correspondence is also observed in the vibrational characteristics as computed from the Anisotropic Network Model, therefore hinting at a non-superficial link between the proteins and the defect-laden cubic crystalline order.
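The generation rule described in the abstract (random removal of beads from an SC cluster while keeping a single connected component) can be illustrated with a minimal sketch. It is not the authors' implementation. The cluster shape (a cube of side `side`), the contact definition (the six nearest lattice neighbours) and the names `is_connected` and `generate_structure` are all assumptions made for illustration; the abstract specifies only the lattice type, the connectivity constraint and the 40% occupancy.

```python
import random
from collections import deque

# Assumption: two beads are in contact when they are nearest neighbours
# on the simple cubic lattice (the abstract does not state the cutoff).
NEIGHBOR_STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def is_connected(beads):
    """Return True if the occupied sites form one connected component."""
    if not beads:
        return False
    start = next(iter(beads))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over occupied neighbours
        x, y, z = queue.popleft()
        for dx, dy, dz in NEIGHBOR_STEPS:
            nb = (x + dx, y + dy, z + dz)
            if nb in beads and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return len(seen) == len(beads)


def generate_structure(side=5, occupancy=0.4, seed=0):
    """Remove random beads from a side^3 SC cluster down to the target
    occupancy, rejecting any removal that would split the structure."""
    rng = random.Random(seed)
    beads = {(x, y, z) for x in range(side)
             for y in range(side) for z in range(side)}
    target = max(1, round(occupancy * side ** 3))
    while len(beads) > target:
        candidates = sorted(beads)
        rng.shuffle(candidates)
        for bead in candidates:
            beads.remove(bead)
            if is_connected(beads):
                break            # removal accepted
            beads.add(bead)      # removal would disconnect: undo it
        else:
            raise RuntimeError("no removable bead found")
    return beads


structure = generate_structure(side=5, occupancy=0.4, seed=1)
print(len(structure), is_connected(structure))
```

Rejecting disconnecting removals one bead at a time keeps the connectivity constraint satisfied at every step. Any connected cluster of two or more beads contains at least one bead whose removal leaves it connected, so the loop can always proceed to the target occupancy.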
Award ID(s):
1825254
NSF-PAR ID:
10312846
Author(s) / Creator(s):
; ; ; ;
Editor(s):
Estrada, Ernesto
Date Published:
Journal Name:
Journal of Complex Networks
Volume:
10
Issue:
1
ISSN:
2051-1310
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Complex biological, neuroscience, geoscience and social networks exhibit heterogeneous self-similar higher order topological structures that are usually characterized as being multifractal in nature. However, describing their topological complexity through a compact mathematical description and deciphering their topological governing rules have remained elusive, which has prevented a comprehensive understanding of networks. To overcome this challenge, we propose a weighted multifractal graph model capable of capturing the underlying generating rules of complex systems and characterizing their node heterogeneity and pairwise interactions. To infer the generating measure with hidden information, we introduce a variational expectation maximization framework. We demonstrate the robustness of the network generator reconstruction as a function of model properties, especially in noisy and partially observed scenarios. The proposed network generator inference framework is able to reproduce network properties, differentiate varying structures in brain networks and chromosomal interactions, and detect topologically associating domain regions in conformation maps of the human genome.
  2. Abstract Although first principles based anharmonic lattice dynamics is one of the most common methods to obtain phonon properties, such a method is impractical for high-throughput search of target thermal materials. We develop an elemental spatial density neural network force field as a bottom-up approach to accurately predict atomic forces of ~80,000 cubic crystals spanning 63 elements. The primary advantage of our indirect machine learning model is the accessibility of phonon transport physics at the same level as first principles, allowing simultaneous prediction of comprehensive phonon properties from a single model. Training on 3182 first principles data and screening 77,091 unexplored structures, we identify 13,461 dynamically stable cubic structures with ultralow lattice thermal conductivity below 1 W m⁻¹ K⁻¹, among which 36 structures are validated by first principles calculations. We propose mean square displacement and bonding-antibonding as two low-cost descriptors to ease the demand of expensive first principles calculations for fast screening of ultralow thermal conductivity. Our model also quantitatively reveals the correlation between off-diagonal coherence and diagonal populations and identifies the distinct crossover from particle-like to wave-like heat conduction. Our algorithm is promising for accelerating discovery of novel phononic crystals for emerging applications, such as thermoelectrics, superconductivity, and topological phonons for quantum information technology.
  3. Machine learning at the extreme edge has enabled a plethora of intelligent, time-critical, and remote applications. However, deploying interpretable artificial intelligence systems that can perform high-level symbolic reasoning and satisfy the underlying system rules and physics within the tight platform resource constraints is challenging. In this paper, we introduce TinyNS, the first platform-aware neurosymbolic architecture search framework for joint optimization of symbolic and neural operators. TinyNS provides recipes and parsers to automatically write microcontroller code for five types of neurosymbolic models, combining the context awareness and integrity of symbolic techniques with the robustness and performance of machine learning models. TinyNS uses a fast, gradient-free, black-box Bayesian optimizer over discontinuous, conditional, numeric, and categorical search spaces to find the best synergy of symbolic code and neural networks within the hardware resource budget. To guarantee deployability, TinyNS talks to the target hardware during the optimization process. We showcase the utility of TinyNS by deploying microcontroller-class neurosymbolic models through several case studies. In all use cases, TinyNS outperforms purely neural or purely symbolic approaches while guaranteeing execution on real hardware.
  4. Abstract The recent breakthroughs in structure prediction, where methods such as AlphaFold demonstrated near‐atomic accuracy, herald a paradigm shift in structural biology. The 200 million high‐accuracy models released in the AlphaFold Database are expected to guide protein science in the coming decades. Partitioning these AlphaFold models into domains and assigning them to an evolutionary hierarchy provide an efficient way to gain functional insights into proteins. However, classifying such a large number of predicted structures challenges the infrastructure of current structure classifications, including our Evolutionary Classification of protein Domains (ECOD). Better computational tools are urgently needed to parse and classify domains from AlphaFold models automatically. Here we present a Domain Parser for AlphaFold Models (DPAM) that can automatically recognize globular domains from these models based on inter‐residue distances in 3D structures, predicted aligned errors, and ECOD domains found by sequence (HHsuite) and structural (Dali) similarity searches. Based on a benchmark of 18,759 AlphaFold models, we demonstrate that DPAM can recognize 98.8% of domains and assign correct boundaries for 87.5%, significantly outperforming structure‐based domain parsers and homology‐based domain assignment using ECOD domains found by HHsuite or Dali. Application of DPAM to the massive AlphaFold models will enable efficient classification of domains, providing evolutionary contexts and facilitating functional studies.
  5. Bellwied, R ; Geurts, F ; Rapp, R ; Ratti, C ; Timmins, A ; Vitev, I (Ed.)
    We employ an Einstein-Maxwell-dilaton model, based on the gauge/gravity correspondence, to obtain the thermodynamics and transport properties of the hot and dense quark-gluon plasma. The model, which is constrained to reproduce lattice QCD thermodynamics at zero density, predicts a critical point and a first-order line at finite temperature and density, and is used to quantify jet energy loss through simulations of high-energy collision events.