skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Embeddings and labeling schemes for A*
A* is a classic and popular method for graphs search and path finding. It assumes the existence of a heuristic function h(u,t) that estimates the shortest distance from any input node u to the destination t. Traditionally, heuristics have been handcrafted by domain experts. However, over the last few years, there has been a growing interest in learning heuristic functions. Such learned heuristics estimate the distance between given nodes based on "features" of those nodes. In this paper we formalize and initiate the study of such feature-based heuristics. In particular, we consider heuristics induced by norm embeddings and distance labeling schemes, and provide lower bounds for the tradeoffs between the number of dimensions or bits used to represent each graph node, and the running time of the A* algorithm. We also show that, under natural assumptions, our lower bounds are almost optimal.  more » « less
Award ID(s):
2022448
PAR ID:
10341844
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Innovations in Theoretical Computer Science (ITCS)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Chan, Timothy; Fischer, Johannes; Iacono, John; Herman, Grzegorz (Ed.)
    We study fundamental directed graph (digraph) problems in the streaming model. An initial investigation by Chakrabarti, Ghosh, McGregor, and Vorotnikova [SODA'20] on streaming digraphs showed that while most of these problems are provably hard in general, some of them become tractable when restricted to the well-studied class of tournament graphs where every pair of nodes shares exactly one directed edge. Thus, we focus on tournaments and improve the state of the art for multiple problems in terms of both upper and lower bounds. Our primary upper bound is a deterministic single-pass semi-streaming algorithm (using Õ(n) space for n-node graphs, where Õ(.) hides polylog(n) factors) for decomposing a tournament into strongly connected components (SCC). It improves upon the previously best-known algorithm by Baweja, Jia, and Woodruff [ITCS'22] in terms of both space and passes: for p ⩾ 1, they used (p+1) passes and Õ(n^{1+1/p}) space. We further extend our algorithm to digraphs that are close to tournaments and establish tight bounds demonstrating that the problem’s complexity grows smoothly with the "distance" from tournaments. Applying our SCC-decomposition framework, we obtain improved - and in some cases, optimal - tournament algorithms for s,t-reachability, strong connectivity, Hamiltonian paths and cycles, and feedback arc set. On the other hand, we prove lower bounds exhibiting that some well-studied problems - such as (exact) feedback arc set and s,t-distance - remain hard (require Ω(n²) space) on tournaments. Moreover, we generalize the former problem’s lower bound to establish space-approximation tradeoffs: any single-pass (1± ε)-approximation algorithm requires Ω(n/√{ε}) space. Finally, we settle the streaming complexities of two basic digraph problems studied by prior work: acyclicity testing of tournaments and sink finding in DAGs. As a whole, our collection of results contributes significantly to the growing literature on streaming digraphs. 
    more » « less
  2. Gortz, Inge Li; Farach-Colton, Martin; Puglisi, Simon J.; Herman, Grzegorz (Ed.)
    Computing the diameter of a graph, i.e. the largest distance, is a fundamental problem that is central in fine-grained complexity. In undirected graphs, the Strong Exponential Time Hypothesis (SETH) yields a lower bound on the time vs. approximation trade-off that is quite close to the upper bounds. In directed graphs, however, where only some of the upper bounds apply, much larger gaps remain. Since d(u,v) may not be the same as d(v,u), there are multiple ways to define the problem, the two most natural being the (one-way) diameter (max_(u,v) d(u,v)) and the roundtrip diameter (max_{u,v} d(u,v)+d(v,u)). In this paper we make progress on the outstanding open question for each of them. - We design the first algorithm for diameter in sparse directed graphs to achieve n^{1.5-ε} time with an approximation factor better than 2. The new upper bound trade-off makes the directed case appear more similar to the undirected case. Notably, this is the first algorithm for diameter in sparse graphs that benefits from fast matrix multiplication. - We design new hardness reductions separating roundtrip diameter from directed and undirected diameter. In particular, a 1.5-approximation in subquadratic time would refute the All-Nodes k-Cycle hypothesis, and any (2-ε)-approximation would imply a breakthrough algorithm for approximate 𝓁_∞-Closest-Pair. Notably, these are the first conditional lower bounds for diameter that are not based on SETH. 
    more » « less
  3. Graph neural networks (GNNs) have shown great potential in learning on graphs, but they are known to perform sub-optimally on link prediction tasks. Existing GNNs are primarily designed to learn node-wise representations and usually fail to capture pairwise relations between target nodes, which proves to be crucial for link prediction. Recent works resort to learning more expressive edge-wise representations by enhancing vanilla GNNs with structural features such as labeling tricks and link prediction heuristics, but they suffer from high computational overhead and limited scalability. To tackle this issue, we propose to learn structural link representations by augmenting the message-passing framework of GNNs with Bloom signatures. Bloom signatures are hashing-based compact encodings of node neighborhoods, which can be efficiently merged to recover various types of edge-wise structural features. We further show that any type of neighborhood overlap-based heuristic can be estimated by a neural network that takes Bloom signatures as input. GNNs with Bloom signatures are provably more expressive than vanilla GNNs and also more scalable than existing edge-wise models. Experimental results on five standard link prediction benchmarks show that our proposed model achieves comparable or better performance than existing edge-wise GNN models while being 3-200 × faster and more memory-efficient for online inference. 
    more » « less
  4. Gørtz, Inge Li; Farach-Colton, Martin; Puglisi, Simon J; Herman, Grzegorz (Ed.)
    Computing the diameter of a graph, i.e. the largest distance, is a fundamental problem that is central in fine-grained complexity. In undirected graphs, the Strong Exponential Time Hypothesis (SETH) yields a lower bound on the time vs. approximation trade-off that is quite close to the upper bounds. In directed graphs, however, where only some of the upper bounds apply, much larger gaps remain. Since d(u,v) may not be the same as d(v,u), there are multiple ways to define the problem, the two most natural being the (one-way) diameter (max_(u,v) d(u,v)) and the roundtrip diameter (max_{u,v} d(u,v)+d(v,u)). In this paper we make progress on the outstanding open question for each of them. - We design the first algorithm for diameter in sparse directed graphs to achieve n^{1.5-ε} time with an approximation factor better than 2. The new upper bound trade-off makes the directed case appear more similar to the undirected case. Notably, this is the first algorithm for diameter in sparse graphs that benefits from fast matrix multiplication. - We design new hardness reductions separating roundtrip diameter from directed and undirected diameter. In particular, a 1.5-approximation in subquadratic time would refute the All-Nodes k-Cycle hypothesis, and any (2-ε)-approximation would imply a breakthrough algorithm for approximate 𝓁_∞-Closest-Pair. Notably, these are the first conditional lower bounds for diameter that are not based on SETH. 
    more » « less
  5. Selecting appropriate inputs for systems described by complex networks is an important but difficult problem that largely remains open in the field of control of networks. Recent work has proposed two methods for energy efficient input selection; a gradient-based heuristic and a greedy approximation algorithm. We propose here an alternative method for input selection based on the analytic solution of the controllability Gramian of the ‘balloon graph’, a special model graph that captures the role of both distance and redundant paths between a driver node and a target node. The method presented is especially applicable for large networks where one is interested in controlling only a small number of outputs, or target nodes, for which current methods may not be practical because they require computing a typically very ill-conditioned matrix, called the controllability Gramian. Our method produces comparable results to the previous methods while being more computational efficient. 
    more » « less