NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Approximately counting and sampling Hamiltonian motifs in sublinear time

Levi, Reut; Eden, Talya; Ron, Dana; Rubinfeld, Ronitt (June 2025, 57th Annual ACM Symposium on Theory of Computing (STOC 2025))

Counting small subgraphs, referred to as motifs, in large graphs is a fundamental task in graph analysis, extensively studied across various contexts and computational models. In the sublinear-time regime, the relaxed problem of approximate counting has been explored within two prominent query frameworks: the standard model, which permits degree, neighbor, and pair queries, and the strictly more powerful augmented model, which additionally allows for uniform edge sampling. Currently, in the standard model, (opti- mal) results have been established only for approximately counting edges, stars, and cliques, all of which have a radius of one. This contrasts sharply with the state of affairs in the augmented model, where algorithmic results (some of which are optimal) are known for any input motif, leading to a disparity which we term the “scope gap" between the two models. In this work, we make significant progress in bridging this gap. Our approach draws inspiration from recent advancements in the augmented model and utilizes a framework centered on counting by uniform sampling, thus allowing us to establish new results in the standard model and simplify on previous results. In particular, our first, and main, contribution is a new algorithm in the standard model for approximately counting any Hamiltonian motif in sublinear time, where the complexity of the algorithm is the sum of two terms. One term equals the complexity of the known algorithms by Assadi, Kapralov, and Khanna (ITCS 2019) and Fichtenberger and Peng (ICALP 2020) in the (strictly stronger) augmented model and the other is an additional, necessary, additive overhead. Our second contribution is a variant of our algorithm that en- ables nearly uniform sampling of these motifs, a capability pre- viously limited in the standard model to edges and cliques. Our third contribution is to introduce even simpler algorithms for stars and cliques by exploiting their radius-one property. As a result, we simplify all previously known algorithms in the standard model for stars (Gonen, Ron, Shavitt (SODA 2010)), triangles (Eden, Levi, Ron Seshadhri (FOCS 2015)) and cliques (Eden, Ron, Seshadri (STOC 2018)).
more » « less
Free, publicly-accessible full text available June 23, 2026
A Sublinear Algorithm for Approximate Shortest Paths in Large Networks

https://doi.org/10.1145/3701551.3703512

Basu, Sabyasachi; Kōshima, Nadia; Eden, Talya; Ben-Eliezer, Omri; Seshadhri, C (March 2025, ACM)

Free, publicly-accessible full text available March 10, 2026
Plant-and-Steal: Truthful Fair Allocations via Predictions

Cohen, Ilan Reuven; Eden, Alon; Eden, Talya; Vasilyan, Arsen (December 2024, Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems (NeurIPS 2024))

We study truthful mechanisms for approximating the Maximin-Share (MMS) allocation of agents with additive valuations for indivisible goods. Algorithmically, constant factor approximations exist for the problem for any number of agents. When adding incentives to the mix, a jarring result by Amanatidis, Birmpas, Christodoulou, and Markakis [EC 2017] shows that the best possible approximation for two agents and m items is ⌊m2⌋. We adopt a learning-augmented framework to investigate what is possible when some prediction on the input is given. For two agents, we give a truthful mechanism that takes agents' ordering over items as prediction. When the prediction is accurate, we give a 2-approximation to the MMS (consistency), and when the prediction is off, we still get an ⌈m2⌉-approximation to the MMS (robustness). We further show that the mechanism's performance degrades gracefully in the number of mistakes" in the prediction; i.e., we interpolate (up to constant factors) between the two extremes: when there are no mistakes, and when there is a maximum number of mistakes. We also show an impossibility result on the obtainable consistency for mechanisms with finite robustness. For the general case of n≥2 agents, we give a 2-approximation mechanism for accurate predictions, with relaxed fallback guarantees. Finally, we give experimental results which illustrate when different components of our framework, made to insure consistency and robustness, come into play.
more » « less
Full Text Available
Sampling an Edge in Sublinear Time Exactly and Optimally

https://doi.org/10.1137/1.9781611977585.ch23

Eden, Talya; Narayanan, Shyam; Tetek, Jakub (January 2023, Symposium on Simplicity in Algorithms)

Sampling edges from a graph in sublinear time is a fundamental problem and a powerful subroutine for designing sublinear-time algorithms. Suppose we have access to the vertices of the graph and know a constant-factor approximation to the number of edges. An algorithm for pointwise ε-approximate edge sampling with complexity has been given by Eden and Rosenbaum [SOSA 2018]. This has been later improved by Tetek and Thorup [STOC 2022] to . At the same time, time is necessary. We close the problem, by giving an algorithm with complexity for the task of sampling an edge exactly uniformly.
more » « less
Full Text Available
Triangle Counting with Local Edge Differential Privacy

https://doi.org/10.4230/LIPICS.ICALP.2023.52

Eden, Talya; Liu, Quanquan C.; Raskhodnikova, Sofya; Smith, Adam (January 2023, Leibniz International Proceedings in Informatics (LIPIcs):50th International Colloquium on Automata, Languages, and Programming (ICALP 2023))
Etessami, Kousha; Feige, Uriel; Puppis, Gabriele (Ed.)
Many deployments of differential privacy in industry are in the local model, where each party releases its private information via a differentially private randomizer. We study triangle counting in the noninteractive and interactive local model with edge differential privacy (that, intuitively, requires that the outputs of the algorithm on graphs that differ in one edge be indistinguishable). In this model, each party’s local view consists of the adjacency list of one vertex. In the noninteractive model, we prove that additive Ω(n²) error is necessary, where n is the number of nodes. This lower bound is our main technical contribution. It uses a reconstruction attack with a new class of linear queries and a novel mix-and-match strategy of running the local randomizers with different completions of their adjacency lists. It matches the additive error of the algorithm based on Randomized Response, proposed by Imola, Murakami and Chaudhuri (USENIX2021) and analyzed by Imola, Murakami and Chaudhuri (CCS2022) for constant ε. We use a different postprocessing of Randomized Response and provide tight bounds on the variance of the resulting algorithm. In the interactive setting, we prove a lower bound of Ω(n^{3/2}) on the additive error. Previously, no hardness results were known for interactive, edge-private algorithms in the local model, except for those that follow trivially from the results for the central model. Our work significantly improves on the state of the art in differentially private graph analysis in the local model.
more » « less
Triangle Counting with Local Edge Differential Privacy

Eden, Talya; Liu, Quanquan C.; Raskhodnikova, Sofya; Smith, Adam D. (January 2023, International Colloquium on Automata, Languages, and Programming)

Many deployments of differential privacy in industry are in the local model, where each party releases its private information via a differentially private randomizer. We study triangle counting in the noninteractive and interactive local model with edge differential privacy (that, intuitively, requires that the outputs of the algorithm on graphs that differ in one edge be indistinguishable). In this model, each party’s local view consists of the adjacency list of one vertex. In the noninteractive model, we prove that additive Ω(n^2) error is necessary, where n is the number of nodes. This lower bound is our main technical contribution. It uses a reconstruction attack with a new class of linear queries and a novel mix-and-match strategy of running the local randomizers with different completions of their adjacency lists. It matches the additive error of the algorithm based on Randomized Response, proposed by Imola, Murakami and Chaudhuri (USENIX2021) and analyzed by Imola, Murakami and Chaudhuri (CCS2022) for constant ε. We use a different postprocessing of Randomized Response and provide tight bounds on the variance of the resulting algorithm. In the interactive setting, we prove a lower bound of Ω(n3/2) on the additive error. Previously, no hardness results were known for interactive, edge-private algorithms in the local model, except for those that follow trivially from the results for the central model. Our work significantly improves on the state of the art in differentially private graph analysis in the local model.
more » « less
Full Text Available
Massively Parallel Algorithms for Small Subgraph Counting

Biswas, Amartya Shankha; Eden, Talya; Liu, Quanquan C.; Rubinfeld, Ronitt Slobodan (January 2022, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022))

Over the last two decades, frameworks for distributed-memory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel Computation (MPC) model is the de-facto standard for studying graph algorithms in these frameworks theoretically. Subgraph counting is one such fundamental problem in analyzing massive graphs, with the main algorithmic challenges centering on designing methods which are both scalable and accurate. Given a graph G = (V, E) with n vertices, m edges and T triangles, our first result is an algorithm that outputs a (1+ε)-approximation to T, with asymptotically optimal round and total space complexity provided any S ≥ max{(√ m, n²/m)} space per machine and assuming T = Ω(√{m/n}). Our result gives a quadratic improvement on the bound on T over previous works. We also provide a simple extension of our result to counting any subgraph of k size for constant k ≥ 1. Our second result is an O_δ(log log n)-round algorithm for exactly counting the number of triangles, whose total space usage is parametrized by the arboricity α of the input graph. We extend this result to exactly counting k-cliques for any constant k. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most 5 can be implemented in the MPC model in Õ_δ(√{log n}) rounds, O(n^δ) space per machine and O(mα³) total space. In addition to our theoretical results, we simulate our triangle counting algorithms in real-world graphs obtained from the Stanford Network Analysis Project (SNAP) database. Our results show that both our approximate and exact counting algorithms exhibit improvements in terms of round complexity and approximation ratio, respectively, compared to two previous widely used algorithms for these problems.
more » « less
Full Text Available
Triangle and Four Cycle Counting with Predictions in Graph Streams

Chen, Justin Y; Eden, Talya; Indyk, Piotr; Lin, Honghao; Narayanan, Shyam; Rubinfeld, Ronitt; Silwal, Sandeep; Wagner, Tal; Woodruff, David P.; Zhang, Michael (January 2022, International Conference on Learning Representations)

Full Text Available
Learning-based Support Estimation in Sublinear Time

Eden, Talya; Indyk, Piotr; Narayanan, Shyam; Rubinfeld, Ronitt; Silwal, Sandeep; Wagner, Tal (January 2021, International Conference on Learning Representations)

We consider the problem of estimating the number of distinct elements in a large data set (or, equivalently, the support size of the distribution induced by the data set) from a random sample of its elements. The problem occurs in many applications, including biology, genomics, computer systems and linguistics. A line of research spanning the last decade resulted in algorithms that estimate the support up to ±εn from a sample of size O(log2(1/ε)·n/logn), where n is the data set size. Unfortunately, this bound is known to be tight, limiting further improvements to the complexity of this problem. In this paper we consider estimation algorithms augmented with a machine-learning-based predictor that, given any element, returns an estimation of its frequency. We show that if the predictor is correct up to a constant approximation factor, then the sample complexity can be reduced significantly, to log(1/ε)·n1−Θ(1/log(1/ε)).We evaluate the proposed algorithms on a collection of data sets, using the neural-network based estimators from Hsu et al, ICLR’19 as predictors. Our experiments demonstrate substantial (up to 3x) improvements in the estimation accuracy com-pared to the state of the art algorithm.
more » « less
Full Text Available
On approximating the number of k-cliques in sublinear time

https://doi.org/10.1145/3188745.3188810

Eden, Talya; Ron, Dana; Seshadhri, C. (January 2018, Symposlum on Theory of Computing (STOC))

Full Text Available

Search for: All records