Abstract: Estimating the accuracy of protein structural models is a critical task in protein bioinformatics. The need for robust methods for estimation of protein model accuracy (EMA) is prevalent in protein structure prediction, where computationally predicted structures must be screened rapidly both for the reliability of the positions predicted for each amino acid residue and for their overall quality. Current EMA methods are either tightly coupled to existing protein structure prediction methods or evaluate protein structures without sufficiently leveraging the rich geometric information available in those structures to guide accuracy estimation. In this work, we propose a geometric message passing neural network, the geometry-complete perceptron network for protein structure EMA (GCPNet-EMA). Through rigorous computational benchmarks, we demonstrate that GCPNet-EMA produces accuracy estimates 47% faster and more than 10% (6%) more correlated with ground-truth measures of per-residue (per-target) structural accuracy than baseline state-of-the-art methods for tertiary (multimer) structure EMA, including AlphaFold 2. The source code and data for GCPNet-EMA are available on GitHub, and a public web server implementation is freely available.
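As context for how per-residue correlations of this kind are typically computed (not taken from the GCPNet-EMA paper itself), the sketch below scores predicted per-residue accuracies against ground-truth per-residue lDDT values with a Pearson correlation; the array names and the lDDT choice are illustrative assumptions.

```python
# Minimal sketch: correlating predicted per-residue accuracy with ground truth.
# Assumptions (not from the abstract): ground truth is expressed as per-residue
# lDDT scores, and predictions are plDDT-style values in the same 0-1 range.
import numpy as np
from scipy.stats import pearsonr

def per_residue_correlation(predicted: np.ndarray, ground_truth: np.ndarray) -> float:
    """Pearson correlation between predicted and true per-residue accuracies."""
    assert predicted.shape == ground_truth.shape
    r, _p_value = pearsonr(predicted, ground_truth)
    return r

# Toy usage with hypothetical scores for a 5-residue model.
pred = np.array([0.92, 0.85, 0.40, 0.77, 0.63])
true = np.array([0.95, 0.80, 0.35, 0.70, 0.60])
print(f"per-residue Pearson r = {per_residue_correlation(pred, true):.3f}")
```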
Reconstructing Sparse Multiplex Networks with Application to Covert Networks
Network structure provides critical information for understanding the dynamic behavior of complex systems. However, the complete structure of real-world networks is often unavailable, so it is crucially important to develop approaches for inferring a more complete structure of networks. In this paper, we integrate the configuration model for generating random networks into an Expectation–Maximization–Aggregation (EMA) framework to reconstruct the complete structure of multiplex networks. We validate the proposed EMA framework against the Expectation–Maximization (EM) framework and a random model on several real-world multiplex networks, including both covert and overt ones. We find that the EMA framework generally achieves the best predictive accuracy, outperforming both the EM framework and the random model. As the number of layers increases, the performance improvement of EMA over EM decreases. The inferred multiplex networks can be leveraged to inform decision-making on monitoring covert networks as well as on allocating limited resources for collecting additional information to improve reconstruction accuracy. For law enforcement agencies, the inferred complete network structure can be used to develop more effective strategies for covert network interdiction.
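As a hedged illustration of the configuration-model ingredient mentioned above (a sketch under assumptions, not the paper's EMA algorithm), the snippet below scores unobserved node pairs in one partially observed layer by the standard configuration-model connection probability k_i * k_j / (2m); the toy graph and function name are made up for the example.

```python
# Illustrative sketch: configuration-model link probabilities for unobserved pairs.
# This is NOT the paper's EMA framework; it only shows the configuration-model
# prior that an EM-style reconstruction could build on.
import itertools
import networkx as nx

def configuration_model_scores(g: nx.Graph) -> dict:
    """Score each non-edge (i, j) by k_i * k_j / (2m), the expected-edge
    probability under the configuration model (capped at 1)."""
    degrees = dict(g.degree())
    two_m = 2 * g.number_of_edges()
    scores = {}
    for i, j in itertools.combinations(g.nodes(), 2):
        if not g.has_edge(i, j):
            scores[(i, j)] = min(1.0, degrees[i] * degrees[j] / two_m)
    return scores

# Toy observed layer of a multiplex network.
observed = nx.Graph([(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)])
for pair, p in sorted(configuration_model_scores(observed).items(), key=lambda kv: -kv[1]):
    print(pair, round(p, 3))
```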
- Award ID(s): 2047488
- PAR ID: 10405437
- Date Published:
- Journal Name: Entropy
- Volume: 25
- Issue: 1
- ISSN: 1099-4300
- Page Range / eLocation ID: 142
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Summary: Latent space models are frequently used for modelling single-layer networks and include many popular special cases, such as the stochastic block model and the random dot product graph. However, they are not well developed for more complex network structures, which are becoming increasingly common in practice. In this article we propose a new latent space model for multiplex networks, i.e., multiple heterogeneous networks observed on a shared node set. Multiplex networks can represent a network sample with shared node labels, a network evolving over time, or a network with multiple types of edges. The key feature of the proposed model is that it learns from data how much of the network structure is shared between layers and pools information across layers as appropriate. We establish identifiability, develop a fitting procedure using convex optimization in combination with a nuclear-norm penalty, and prove a guarantee of recovery for the latent positions provided there is sufficient separation between the shared and the individual latent subspaces. We compare the model with competing methods in the literature on simulated networks and on a multiplex network describing the worldwide trade of agricultural products.
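The nuclear-norm penalty mentioned above is typically handled through its proximal operator, singular-value soft-thresholding; the sketch below shows that standard operation in isolation, as a generic illustration with assumed names, not the authors' full fitting procedure for multiplex latent spaces.

```python
# Generic sketch: the proximal operator of the nuclear norm (singular-value
# soft-thresholding), a standard building block of nuclear-norm-penalized fits.
# It is not the paper's complete multiplex latent space estimator.
import numpy as np

def svt(matrix: np.ndarray, tau: float) -> np.ndarray:
    """Return argmin_X 0.5*||X - matrix||_F^2 + tau*||X||_* by soft-thresholding
    the singular values of `matrix`."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_thresholded = np.maximum(s - tau, 0.0)
    return (u * s_thresholded) @ vt

# Toy usage: shrink a noisy low-rank matrix back toward low rank.
rng = np.random.default_rng(0)
low_rank = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 6))
noisy = low_rank + 0.1 * rng.normal(size=(6, 6))
print(np.linalg.matrix_rank(svt(noisy, tau=0.5), tol=1e-6))
```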
Learning low-dimensional representations of graphs has facilitated the use of traditional machine learning techniques for solving classic network analysis tasks such as link prediction, node classification, and community detection. However, to date, the vast majority of these learning tasks have focused on traditional single-layer/unimodal networks and largely ignore the case of multiplex networks. A multiplex network is a suitable structure for modeling multi-dimensional real-world complex systems. It consists of multiple layers where each layer represents a different relationship among the network nodes. In this work, we propose MUNEM, a novel approach for learning a low-dimensional representation of a multiplex network using a triplet loss objective function. In our approach, we preserve the global structure of each layer while at the same time fusing knowledge among different layers during the learning process. We evaluate the effectiveness of our proposed method by testing and comparing on real-world multiplex networks from different domains, such as collaboration networks, protein-protein interaction networks, and online social networks. Finally, in order to deliberately examine the effect of our model's parameters, we conduct extensive experiments on synthetic multiplex networks.
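To make the triplet-loss idea concrete, here is a minimal, generic margin-based triplet loss over node embeddings; the tensor shapes, Euclidean distance, and sampling are my own illustrative assumptions and may differ from MUNEM's actual objective.

```python
# Minimal sketch of a margin-based triplet loss over node embeddings.
# Anchor/positive/negative sampling and the distance choice are assumptions;
# they are not taken from the MUNEM paper.
import torch

def triplet_loss(anchor: torch.Tensor,
                 positive: torch.Tensor,
                 negative: torch.Tensor,
                 margin: float = 1.0) -> torch.Tensor:
    """Encourage d(anchor, positive) + margin <= d(anchor, negative)."""
    d_pos = (anchor - positive).pow(2).sum(dim=-1)
    d_neg = (anchor - negative).pow(2).sum(dim=-1)
    return torch.clamp(d_pos - d_neg + margin, min=0.0).mean()

# Toy usage: 8 triplets of 16-dimensional node embeddings.
emb = torch.randn(3, 8, 16, requires_grad=True)
loss = triplet_loss(emb[0], emb[1], emb[2])
loss.backward()
print(float(loss))
```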
Multiplex networks are complex graph structures in which a set of entities are connected to each other via multiple types of relations, each relation representing a distinct layer. Such graphs are used to investigate many complex biological, social, and technological systems. In this work, we present a novel semi-supervised approach for structure-aware representation learning on multiplex networks. Our approach relies on maximizing the mutual information between local node-wise patch representations and label-correlated, structure-aware global graph representations to model the nodes and cluster structures jointly. Specifically, it leverages a novel cluster-aware, node-contextualized global graph summary generation strategy for effective joint modeling of node and cluster representations across the layers of a multiplex network. Empirically, we demonstrate that the proposed architecture outperforms state-of-the-art methods in a range of tasks: classification, clustering, visualization, and similarity search on seven real-world multiplex networks for various experiment settings.
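Mutual-information maximization between local patches and a global summary is commonly approximated with a bilinear discriminator trained to tell real (patch, summary) pairs from corrupted ones, as in Deep Graph Infomax; the sketch below shows that generic objective in my own minimal formulation, not the cluster-aware architecture of this paper.

```python
# Generic sketch of an InfoMax-style objective: a bilinear discriminator scores
# (node patch, graph summary) pairs, trained with BCE against corrupted pairs.
# The paper's layer- and cluster-aware details are intentionally omitted.
import torch
import torch.nn as nn

class BilinearDiscriminator(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(dim, dim))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, patches: torch.Tensor, summary: torch.Tensor) -> torch.Tensor:
        # score_i = patch_i^T W summary
        return patches @ self.weight @ summary

dim, n_nodes = 16, 10
disc = BilinearDiscriminator(dim)
patches = torch.randn(n_nodes, dim)           # node-wise patch representations
corrupted = patches[torch.randperm(n_nodes)]  # shuffled nodes act as negatives
summary = patches.mean(dim=0)                 # simple readout as the global summary

logits = torch.cat([disc(patches, summary), disc(corrupted, summary)])
labels = torch.cat([torch.ones(n_nodes), torch.zeros(n_nodes)])
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
print(float(loss))
```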
Multilayer networks describe the rich ways in which nodes are related by accounting for different relationships in separate layers. These multiple relationships are naturally represented by an adjacency tensor. In this work we study the use of the nonnegative Tucker decomposition (NNTuck) of such tensors under a KL loss as an expressive factor model that naturally generalizes existing stochastic block models of multilayer networks. Quantifying interdependencies between layers can identify redundancies in the structure of a network, indicate relationships between disparate layers, and potentially inform survey instruments for collecting social network data. We propose definitions of layer independence, dependence, and redundancy based on likelihood ratio tests between nested nonnegative Tucker decompositions. Using both synthetic and real-world data, we evaluate the use and interpretation of the NNTuck as a model of multilayer networks. Algorithmically, we show that using expectation maximization (EM) to maximize the log-likelihood under the NNTuck is step-by-step equivalent to tensorial multiplicative updates for the NNTuck under a KL loss, extending a previously known equivalence from nonnegative matrices to nonnegative tensors.
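For the matrix case from which the abstract says the NNTuck equivalence is extended, the classical multiplicative updates for nonnegative matrix factorization under a (generalized) KL divergence look as follows; this is the textbook Lee–Seung scheme shown only as a reference point, not the paper's tensorial algorithm.

```python
# Textbook multiplicative updates for NMF under a (generalized) KL divergence,
# V ≈ W H with W, H >= 0 (Lee & Seung). The paper's tensorial NNTuck updates
# generalize this matrix-case scheme; this sketch covers only the matrix case.
import numpy as np

def kl_nmf(V: np.ndarray, rank: int, n_iter: int = 200, eps: float = 1e-10):
    rng = np.random.default_rng(0)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        W *= (V / WH) @ H.T / (H.sum(axis=1) + eps)
        WH = W @ H + eps
        H *= W.T @ (V / WH) / (W.sum(axis=0)[:, None] + eps)
    return W, H

# Toy usage: recover an approximate rank-2 factorization of a nonnegative matrix.
V = np.abs(np.random.default_rng(1).random((6, 2)) @ np.random.default_rng(2).random((2, 8)))
W, H = kl_nmf(V, rank=2)
print(np.round(np.abs(V - W @ H).mean(), 4))
```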