

Title: Finding shortest and nearly shortest path nodes in large substantially incomplete networks by hyperbolic mapping
Abstract

Dynamic processes on networks, be it information transfer in the Internet, contagious spreading in a social network, or neural signaling, take place along shortest or nearly shortest paths. Computing shortest paths is a straightforward task when the network of interest is fully known, and there is a plethora of computational algorithms for this purpose. Unfortunately, our maps of most large networks are substantially incomplete due to either the highly dynamic nature of networks, or the high cost of network measurements, or both, rendering traditional path finding methods inefficient. We find that shortest paths in large real networks, such as the network of protein-protein interactions and the Internet at the autonomous system level, are not random but are organized according to latent-geometric rules. If nodes of these networks are mapped to points in latent hyperbolic spaces, shortest paths in them align along geodesic curves connecting endpoint nodes. We find that this alignment is sufficiently strong to allow for the identification of shortest path nodes even in the case of substantially incomplete networks, where the number of missing links exceeds that of observable links. We demonstrate the utility of latent-geometric path finding in problems of cellular pathway reconstruction and communication security.
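The geometric idea in the abstract can be illustrated with a minimal sketch. It assumes each node already has polar coordinates (r, theta) in the hyperbolic plane of curvature -1, and it scores a candidate node by how much the detour through it exceeds the direct endpoint-to-endpoint distance. That detour score is a simple stand-in for "closeness to the geodesic" chosen for this illustration, not necessarily the measure used in the paper, and the names `hyperbolic_distance`, `geodesic_excess`, and `rank_candidates` are hypothetical.

```python
import math


def hyperbolic_distance(p, q):
    """Distance between points p=(r1, theta1) and q=(r2, theta2) in the
    hyperbolic plane of curvature -1 (hyperbolic law of cosines)."""
    r1, t1 = p
    r2, t2 = q
    arg = (math.cosh(r1) * math.cosh(r2)
           - math.sinh(r1) * math.sinh(r2) * math.cos(t1 - t2))
    # Clamp to guard against rounding slightly below 1 for coincident points.
    return math.acosh(max(1.0, arg))


def geodesic_excess(src, dst, node):
    """Extra length of the detour src -> node -> dst over the direct
    geodesic src -> dst. Zero when node lies on the geodesic segment.
    (Illustrative proxy; an assumption of this sketch.)"""
    return (hyperbolic_distance(src, node)
            + hyperbolic_distance(node, dst)
            - hyperbolic_distance(src, dst))


def rank_candidates(coords, src_id, dst_id):
    """Return node ids other than the endpoints, ordered from most to
    least likely to lie near the src-dst geodesic under the proxy score."""
    src, dst = coords[src_id], coords[dst_id]
    others = [n for n in coords if n not in (src_id, dst_id)]
    return sorted(others, key=lambda n: geodesic_excess(src, dst, coords[n]))


# Toy coordinates (made up for illustration only).
coords = {
    "a": (1.0, 0.0),
    "b": (1.0, math.pi),
    "hub": (0.0, 0.0),          # at the origin, on the a-b geodesic
    "off": (1.0, math.pi / 2),  # away from the a-b geodesic
}
ranking = rank_candidates(coords, "a", "b")
```

Note that the ranking uses only node coordinates, not the link list, which is why a score of this kind remains usable when many links are unobserved.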

 
Award ID(s):
1741355
NSF-PAR ID:
10391735
Publisher / Repository:
Nature Publishing Group
Journal Name:
Nature Communications
Volume:
14
Issue:
1
ISSN:
2041-1723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Skyline path queries (SPQs) extend skyline queries to multi-dimensional networks, such as multi-cost road networks (MCRNs). Such queries return a set of non-dominated paths between two given network nodes. Despite extensive work on evaluating different SPQ variants, SPQ evaluation is still very inefficient because no efficient index structures exist to support such queries. Existing index building approaches for supporting shortest-path query execution, when directly extended to support SPQs, use an unreasonable amount of space and time to build, making them impractical for processing large graphs. In this paper, we propose a novel index structure, backbone index, and a corresponding index construction method that condenses an initial MCRN to multiple smaller summarized graphs with different granularity. We present efficient approaches to find approximate solutions to SPQs by utilizing the backbone index structure. Furthermore, to make good use of historical queries and query results, we propose two models, Skyline Path Graph Neural Network (SP-GNN) and Transfer SP-GNN (TSP-GNN), to support effective SPQ processing. Our extensive experiments on real-world large road networks show that the backbone index can support finding meaningful approximate SPQ solutions efficiently. The backbone index can be constructed in a reasonable time, which dramatically outperforms the construction of other types of indexes for road networks. As far as we know, this is the first compact index structure that can support efficient approximate SPQ evaluation on large MCRNs. The results on the SP-GNN and TSP-GNN models also show that both models can help get approximate SPQ answers efficiently.

     
  2. Suboptimal path analysis in a protein structural or dynamical network has become increasingly popular for identifying critical residues involved in allosteric communication and regulation. Several software packages have been developed for calculating suboptimal paths, including NetworkView, WISP, and CNAPATH (Bio3D). Although these packages work well for biological systems of moderate sizes, they either slow down dramatically or suffer from accuracy issues when applied to large systems such as supramolecular complexes. In this work, we develop a new method called SOAN, which implements a modified version of Yen’s algorithm for finding loopless k-shortest paths. Instead of searching the entire protein network, SOAN builds up a subgraph for path calculations based on an initial evaluation of the optimal path and its neighbouring nodes. We test our method on four systems of increasing size and compare it to the NetworkView, WISP and CNAPATH methods. The results show that SOAN is approximately five times faster than NetworkView and orders of magnitude faster than CNAPATH and WISP. In terms of accuracy, SOAN is comparable to CNAPATH and WISP and superior to NetworkView. We also discuss the influence of SOAN input parameters on performance and suggest optimal values.
  3. With the advent of Network Function Virtualization (NFV), Physical Network Functions (PNFs) are gradually being replaced by Virtual Network Functions (VNFs) that are hosted on general purpose servers. Depending on the call flows for specific services, the packets need to pass through an ordered set of network functions (physical or virtual) called Service Function Chains (SFC) before reaching the destination. Conceivably for the next few years during this transition, these networks would have a mix of PNFs and VNFs, which brings an interesting mix of network problems that are studied in this paper: (1) How to find an SFC-constrained shortest path between any pair of nodes? (2) What is the achievable SFC-constrained maximum flow? (3) How to place the VNFs such that the cost (the number of nodes to be virtualized) is minimized, while the maximum flow of the original network can still be achieved even under the SFC constraint? In this work, we will try to address such emerging questions. First, for the SFC-constrained shortest path problem, we propose a transformation of the network graph to minimize the computational complexity of subsequent applications of any shortest path algorithm. Second, we formulate the SFC-constrained maximum flow problem as a fractional multicommodity flow problem, and develop a combinatorial algorithm for a special case of practical interest. Third, we prove that the VNFs placement problem is NP-hard and present an alternative Integer Linear Programming (ILP) formulation. Finally, we conduct simulations to elucidate our theoretical results. 
  4. Virtual network services that span multiple data centers are important to support emerging data-intensive applications in fields such as bioinformatics and retail analytics. Successful virtual network service composition and maintenance requires flexible and scalable ‘constrained shortest path management’ both in the management plane for virtual network embedding (VNE) or network function virtualization service chaining (NFV-SC), as well as in the data plane for traffic engineering (TE). In this paper, we show analytically and empirically that leveraging constrained shortest paths within recent VNE, NFV-SC and TE algorithms can lead to network utilization gains (of up to 50%) and higher energy efficiency. The management of complex VNE, NFV-SC and TE algorithms can, however, be intractable for large scale substrate networks due to the NP-hardness of the constrained shortest path problem. To address such scalability challenges, we propose a novel, exact constrained shortest path algorithm, viz., ‘Neighborhoods Method’ (NM). Our NM uses novel search space reduction techniques and has a theoretical quadratic speed-up, making it practically faster (by an order of magnitude) than recent branch-and-bound exhaustive search solutions. Finally, we detail our NM-based SDN controller implementation in a real-world testbed to further validate practical NM benefits for virtual network services.
  5. Solving the Multi-Agent Path Finding (MAPF) problem optimally is known to be NP-Hard for both make-span and total arrival time minimization. While many algorithms have been developed to solve MAPF problems, there is no dominating optimal MAPF algorithm that works well in all types of problems and no standard guidelines for when to use which algorithm. In this work, we develop the deep convolutional network MAPFAST (Multi-Agent Path Finding Algorithm SelecTor), which takes a MAPF problem instance and attempts to select the fastest algorithm to use from a portfolio of algorithms. We improve the performance of our model by including single-agent shortest paths in the instance embedding given to our model and by utilizing supplemental loss functions in addition to a classification loss. We evaluate our model on a large and diverse dataset of MAPF instances, showing that it outperforms all individual algorithms in its portfolio as well as the state-of-the-art optimal MAPF algorithm selector. We also provide an analysis of algorithm behavior in our dataset to gain a deeper understanding of optimal MAPF algorithms’ strengths and weaknesses to help other researchers leverage different heuristics in algorithm designs.