We study local symmetry breaking problems in the Congest model, focusing on ruling set problems, which generalize the fundamental Maximal Independent Set (MIS) problem. The time (round) complexity of MIS (and ruling sets) has attracted much attention in the Local model. Indeed, recent results (Barenboim et al., FOCS 2012; Ghaffari, SODA 2016) for the MIS problem have tried to break the long-standing $$O(\log n)$$-round "barrier" achieved by Luby's algorithm, but these yield $$o(\log n)$$-round complexity only when the maximum degree $$\Delta$$ is somewhat small relative to $$n$$. More importantly, these results apply only in the Local model. In fact, the best known time bound in the Congest model is still $$O(\log n)$$ (via Luby's algorithm), even for moderately small $$\Delta$$ (i.e., for $$\Delta = \Omega(\log n)$$ and $$\Delta = o(n)$$). Furthermore, message complexity has been largely ignored in the context of local symmetry breaking. Luby's algorithm uses $$O(m)$$ messages on $$m$$-edge graphs, and this is the best known message bound. Our work is motivated by the following central question: can we break the $$\Theta(\log n)$$ time complexity barrier and the $$\Theta(m)$$ message complexity barrier in the Congest model for MIS or closely related symmetry breaking problems? This paper presents progress towards this question for the distributed ruling set problem in the Congest model. A $$\beta$$-ruling set is an independent set such that every node in the graph is at most $$\beta$$ hops from a node in the independent set. We present the following results:
- Time Complexity: We show that we can break the $$O(\log n)$$ "barrier" for 2- and 3-ruling sets. We compute 3-ruling sets in $$O(\log n / \log\log n)$$ rounds with high probability (whp). More generally, we show that 2-ruling sets can be computed in $$O(\log \Delta \cdot (\log n)^{1/2 + \epsilon} + \log n / \log\log n)$$ rounds for any $$\epsilon > 0$$, which is $$o(\log n)$$ for a wide range of $$\Delta$$ values (e.g., $$\Delta = 2^{(\log n)^{1/2 - \epsilon}}$$). These are the first 2- and 3-ruling set algorithms to improve over the $$O(\log n)$$-round complexity of Luby's algorithm in the Congest model.
- Message Complexity: We show an $$\Omega(n^2)$$ lower bound on the message complexity of computing an MIS (i.e., a 1-ruling set), which holds also for randomized algorithms, and we contrast this with a randomized algorithm for 2-ruling sets that, whp, uses only $$O(n \log^2 n)$$ messages and runs in $$O(\Delta \log n)$$ rounds. This is the first message-efficient algorithm known for ruling sets; its message complexity is nearly linear in $$n$$, which is optimal up to a polylogarithmic factor.
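For reference only, here is a minimal sequential simulation (a sketch, not the paper's Congest algorithms) of the $$O(\log n)$$-round Luby baseline discussed above, together with a check of the $$\beta$$-ruling-set definition; the adjacency-dict representation and function names are illustrative assumptions.

```python
import random
from collections import deque

def luby_mis(adj, seed=0):
    """Simulate the synchronous phases of Luby's randomized MIS algorithm.

    adj: dict mapping each node to a set of neighbours (undirected graph).
    Each while-iteration corresponds to one constant-round phase of the
    distributed algorithm; w.h.p. O(log n) phases suffice, which is the
    baseline the 2- and 3-ruling-set results above improve on.
    """
    rng = random.Random(seed)
    active = {v: set(nbrs) for v, nbrs in adj.items()}  # surviving subgraph
    mis = set()
    while active:
        # Every active node draws a random mark; local minima join the MIS.
        mark = {v: rng.random() for v in active}
        joined = {v for v in active if all(mark[v] < mark[u] for u in active[v])}
        removed = set(joined)
        for v in joined:
            removed |= active[v]  # neighbours of new MIS nodes also leave
        mis |= joined
        active = {v: nbrs - removed for v, nbrs in active.items() if v not in removed}
    return mis

def is_ruling_set(adj, ruling, beta):
    """Check the two beta-ruling-set conditions: independence, and every
    node within beta hops of some node in `ruling` (bounded BFS)."""
    if any(u in adj[v] for v in ruling for u in ruling if u != v):
        return False
    dist = {v: 0 for v in ruling}
    queue = deque(ruling)
    while queue:
        v = queue.popleft()
        if dist[v] == beta:
            continue
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return len(dist) == len(adj)

# An MIS is exactly a 1-ruling set:
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
assert is_ruling_set(adj, luby_mis(adj), 1)
```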
On the minimax rate of the Gaussian sequence model under bounded convex constraints
We determine the exact minimax rate of a Gaussian sequence model under bounded convex constraints, purely in terms of the local geometry of the given constraint set $$K$$. Our main result shows that the minimax risk (up to constant factors) under the squared $$L_2$$ loss is given by $$\epsilon^{*2} \wedge \operatorname{diam}(K)^2$$ with \begin{align*} \epsilon^* = \sup \bigg\{\epsilon : \frac{\epsilon^2}{\sigma^2} \leq \log M^{\operatorname{loc}}(\epsilon)\bigg\}, \end{align*} where $$\log M^{\operatorname{loc}}(\epsilon)$$ denotes the local entropy of the set $$K$$, and $$\sigma^2$$ is the variance of the noise. We utilize our abstract result to re-derive known minimax rates for some special sets $$K$$ such as hyperrectangles, ellipses, and more generally quadratically convex orthosymmetric sets. Finally, we extend our results to the unbounded case with known $$\sigma^2$$ to show that the minimax rate in that case is $$\epsilon^{*2}$$.
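As a purely illustrative companion to the characterization above, the sketch below recovers $$\epsilon^*$$ by bisection once a numerical surrogate for the local entropy is supplied; the `log_local_entropy` callable is a hypothetical stand-in (not something the paper defines), and the code assumes the inequality $$\epsilon^2/\sigma^2 \leq \log M^{\operatorname{loc}}(\epsilon)$$ holds at small scales and fails beyond a single crossing point.

```python
def critical_epsilon(log_local_entropy, sigma, diam, tol=1e-9):
    """Approximate eps* = sup{eps : eps^2 / sigma^2 <= log M_loc(eps)} by
    bisection on [0, diam(K)], then return the rate proxy min(eps*^2, diam^2).

    log_local_entropy: callable eps -> log M_loc(eps), a hypothetical
    user-supplied surrogate for the local entropy of the constraint set K.
    """
    lo, hi = 0.0, diam
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid ** 2 / sigma ** 2 <= log_local_entropy(mid):
            lo = mid  # inequality still holds, so eps* is at least mid
        else:
            hi = mid
    return min(lo ** 2, diam ** 2)

# Toy example: a set whose local entropy is of constant order d = 50 at every
# scale (roughly the behaviour of a Euclidean ball in R^50). With sigma = 1 and
# diam(K) = 5, the entropy-critical eps*^2 = 50 exceeds diam^2 = 25, so the
# diameter term determines the rate and this prints approximately 25.0.
print(critical_epsilon(lambda eps: 50.0, sigma=1.0, diam=5.0))
```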
- Award ID(s): 2113684
- PAR ID: 10328066
- Date Published:
- Journal Name: ArXiv.org
- ISSN: 2331-8422
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
In this paper, we study the finite-sum convex optimization problem, focusing on the general convex case. Recently, the study of variance reduced (VR) methods and their accelerated variants has made exciting progress. However, the step size used in the existing VR algorithms typically depends on the smoothness parameter, which is often unknown and requires tuning in practice. To address this problem, we propose two novel adaptive VR algorithms: Adaptive Variance Reduced Accelerated Extra-Gradient (AdaVRAE) and Adaptive Variance Reduced Accelerated Gradient (AdaVRAG). Our algorithms do not require knowledge of the smoothness parameter. AdaVRAE uses $$\mathcal{O}\left(n\log\log n+\sqrt{\frac{n\beta}{\epsilon}}\right)$$ and AdaVRAG uses $$\mathcal{O}\left(n\log\log n+\sqrt{\frac{n\beta\log\beta}{\epsilon}}\right)$$ gradient evaluations to attain an $$\mathcal{O}(\epsilon)$$-suboptimal solution, where $$n$$ is the number of functions in the finite sum and $$\beta$$ is the smoothness parameter. This result matches the best-known convergence rate of non-adaptive VR methods, and it improves upon the convergence of the state-of-the-art adaptive VR method, AdaSVRG. We demonstrate the superior performance of our algorithms compared with previous methods in experiments on real-world datasets.
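The AdaVRAE/AdaVRAG updates are not spelled out in the abstract, so the sketch below only shows the classical (non-adaptive) SVRG variance-reduction step that such finite-sum methods build on; note that its step size must be tuned against the smoothness parameter, which is precisely the dependence the adaptive algorithms above remove. The `grad_i` oracle API is an illustrative assumption.

```python
import numpy as np

def svrg(grad_i, n, w0, step, epochs=20, inner=None, seed=0):
    """Classical (non-adaptive) SVRG for minimizing (1/n) * sum_i f_i(w).

    grad_i(i, w): gradient of the i-th component at w (illustrative API).
    step: fixed step size, which classically must be set using the (often
    unknown) smoothness parameter.
    """
    rng = np.random.default_rng(seed)
    inner = inner or 2 * n
    w = np.array(w0, dtype=float)
    for _ in range(epochs):
        snapshot = w.copy()
        # Full gradient at the snapshot anchors the variance-reduced estimate.
        full_grad = np.mean([grad_i(i, snapshot) for i in range(n)], axis=0)
        for _ in range(inner):
            i = rng.integers(n)
            g = grad_i(i, w) - grad_i(i, snapshot) + full_grad
            w -= step * g
    return w
```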
-
In this paper we provide an $$O(m \log\log^{O(1)} n \log(1/\epsilon))$$-expected-time algorithm for solving Laplacian systems on $$n$$-node $$m$$-edge graphs, improving upon the previous best expected runtime of $$O(m \sqrt{\log n} \log\log^{O(1)} n \log(1/\epsilon))$$ achieved by (Cohen, Kyng, Miller, Pachocki, Peng, Rao, Xu 2014). To obtain this result we provide efficient constructions of low spectral stretch graph approximations with improved stretch and sparsity bounds. As motivation for this work, we show that for every set of vectors in $$\mathbb{R}^d$$ (not just those induced by graphs) and all integers $$k > 1$$ there exists an ultra-sparsifier with $$d - 1 + O(d/k)$$ re-weighted vectors of relative condition number at most $$k^2$$. For small $$k$$, this improves upon the previous best known multiplicative factor of $$k \cdot \tilde{O}(\log d)$$, which is only known for the graph case. Additionally, in the graph case we employ our low-stretch subgraph construction to obtain $$n - 1 + O(n/k)$$-edge ultrasparsifiers of relative condition number $$k^{1 + o(1)}$$ for $$k = \omega(\log^{\delta} n)$$ for any $$\delta > 0$$; this improves upon the previous work for $$k = o(\exp(\log^{1/2 - \delta} n))$$.
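For orientation, the snippet below only sets up the object being solved, a Laplacian system $$Lx = b$$, and hands it to an off-the-shelf conjugate-gradient routine; the fast solvers discussed above replace this generic call with preconditioners built from low-stretch and ultrasparse subgraphs. The `graph_laplacian` helper and the small example graph are illustrative, not from the paper.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

def graph_laplacian(n, edges, weights=None):
    """Build the weighted graph Laplacian L = D - A of an undirected graph
    given as a list of (u, v) edges on vertices 0..n-1."""
    weights = weights or [1.0] * len(edges)
    rows, cols, vals = [], [], []
    for (u, v), w in zip(edges, weights):
        rows += [u, v, u, v]
        cols += [v, u, u, v]
        vals += [-w, -w, w, w]  # off-diagonal -w, diagonal degree contributions
    return sp.csr_matrix((vals, (rows, cols)), shape=(n, n))

# Solve L x = b for a demand vector b that sums to zero (required, since L is
# singular with kernel spanned by the all-ones vector on a connected graph).
n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 3)]
L = graph_laplacian(n, edges)
b = np.array([1.0, 0.0, 0.0, 0.0, -1.0])
x, info = cg(L, b)  # info == 0 on successful convergence
```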
-
In this work, we describe a generic approach to show convergence with high probability for both stochastic convex and non-convex optimization with sub-Gaussian noise. In previous works for convex optimization, either the convergence is only in expectation or the bound depends on the diameter of the domain. Instead, we show high probability convergence with bounds depending on the initial distance to the optimal solution. The algorithms use step sizes analogous to the standard settings and are universal to Lipschitz functions, smooth functions, and their linear combinations. The method can be applied to the non-convex case. We demonstrate an $$O((1+\sigma^{2}\log(1/\delta))/T+\sigma/\sqrt{T})$$ convergence rate when the number of iterations $$T$$ is known and an $$O((1+\sigma^{2}\log(T/\delta))/\sqrt{T})$$ convergence rate when $$T$$ is unknown for SGD, where $$1-\delta$$ is the desired success probability. These bounds improve over existing bounds in the literature. We also revisit AdaGrad-Norm (Ward et al., 2019) and show a new analysis to obtain a high probability bound that does not require the bounded gradient assumption made in previous works. The full version of our paper contains results for the standard per-coordinate AdaGrad.
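As a concrete companion to the AdaGrad-Norm discussion, here is a minimal sketch of the scalar-step-size AdaGrad-Norm update in one common presentation of Ward et al. (2019); it does not reproduce the refined high-probability analysis above, and the stochastic-gradient oracle and toy quadratic are illustrative assumptions.

```python
import numpy as np

def adagrad_norm(grad, x0, eta=1.0, b0=1e-2, iters=1000, seed=0):
    """Minimal AdaGrad-Norm (single adaptive scalar step size) sketch.

    The step size eta / b_t shrinks with the accumulated squared norms of the
    stochastic gradients, so no smoothness or noise parameters are needed.

    grad(x, rng): stochastic gradient oracle (illustrative placeholder).
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    b2 = b0 ** 2
    for _ in range(iters):
        g = grad(x, rng)
        b2 += float(np.dot(g, g))          # accumulate squared gradient norms
        x -= (eta / np.sqrt(b2)) * g       # adaptive scalar step
    return x

# Example: noisy quadratic f(x) = 0.5 * ||x||^2 with Gaussian gradient noise.
noisy_grad = lambda x, rng: x + 0.1 * rng.standard_normal(x.shape)
x_hat = adagrad_norm(noisy_grad, x0=np.ones(10))
```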