-
Identifying novel drug-target interactions is a critical and rate-limiting step in drug discovery. While deep learning models have been proposed to accelerate the identification process, here we show that state-of-the-art models fail to generalize to novel (i.e., never-before-seen) structures. We unveil the mechanisms responsible for this shortcoming, demonstrating how models rely on shortcuts that leverage the topology of the protein-ligand bipartite network, rather than learning the node features. Here we introduce AI-Bind, a pipeline that combines network-based sampling strategies with unsupervised pre-training to improve binding predictions for novel proteins and ligands. We validate AI-Bind predictions via docking simulations and comparison with recent experimental evidence, and step up the process of interpreting machine learning prediction of protein-ligand binding by identifying potential active binding sites on the amino acid sequence. AI-Bind is a high-throughput approach to identify drug-target combinations with the potential of becoming a powerful tool in drug discovery.
Free, publicly accessible full text available December 1, 2024.
-
Abstract In this work, we explore multiplex graph (networks with different types of edges) generation with deep generative models. We discuss some of the challenges associated with multiplex graph generation that make it a more difficult problem than traditional graph generation. We propose TenGAN, the first neural network for multiplex graph generation, which greatly reduces the number of parameters required for multiplex graph generation. We also propose 3 different criteria for evaluating the quality of generated graphs: a graph-attribute-based, a classifier-based, and a tensor-based method. We evaluate its performance on 4 datasets and show that it generally performs better than other existing statistical multiplex graph generative models. We also adapt HGEN, an existing deep generative model for heterogeneous information networks, to work for multiplex graphs and show that our method generally performs better.
-
Abstract The maritime shipping network is the backbone of global trade. Data about the movement of cargo through this network comes in various forms, from ship-level Automatic Identification System (AIS) data, to aggregated bilateral trade volume statistics. Multiple network representations of the shipping system can be derived from any one data source, each of which has advantages and disadvantages. In this work, we examine data in the form of liner shipping service routes, a list of walks through the port-to-port network aggregated from individual shipping companies by a large shipping logistics database. This data is inherently sequential, in that each route represents a sequence of ports called upon by a cargo ship. Previous work has analyzed this data without taking full advantage of the sequential information. Our contribution is to develop a path-based methodology for analyzing liner shipping service route data, computing navigational trajectories through the network that both respect the directional information in the shipping routes and minimize the number of cargo transfers between routes, a desirable property in industry practice. We compare these paths with those computed using other network representations of the same data, finding that our approach results in paths that are longer in terms of both network and nautical distance. We further use these trajectories to re-analyze the role of a previously-identified structural core through the network, as well as to define and analyze a measure of betweenness centrality for nodes and edges.
-
The problem of diffusion control on networks has been extensively studied, with applications ranging from marketing to controlling infectious disease. However, in many applications, such as cybersecurity, an attacker may want to attack a targeted subgraph of a network, while limiting the impact on the rest of the network in order to remain undetected. We present a model POTION in which the principal aim is to optimize graph structure to achieve such targeted attacks. We propose an algorithm POTION-ALG for solving the model at scale, using a gradient-based approach that leverages Rayleigh quotients and pseudospectrum theory. In addition, we present a condition for certifying that a targeted subgraph is immune to such attacks. Finally, we demonstrate the effectiveness of our approach through experiments on real and synthetic networks.
-
Recently, coordinated attack campaigns have become more widespread on the Internet. In May 2017, WannaCry infected more than 300,000 machines in 150 countries in a few days and had a large impact on critical infrastructure. Existing threat sharing platforms cannot easily adapt to emerging attack patterns. At the same time, enterprises have started to adopt machine learning-based threat detection tools in their local networks. In this paper, we pose the question: What information can defenders share across multiple networks to help machine learning-based threat detection adapt to new coordinated attacks? We propose three information sharing methods across two networks, and show how the shared information can be used in a machine learning network-traffic model to significantly improve its ability to detect evasive self-propagating malware.