We consider the problem of designing sublinear-time algorithms for estimating the cost of a minimum metric traveling salesman (TSP) tour. Specifically, given access to an $n \times n$ distance matrix $D$ that specifies pairwise distances between $n$ points, the goal is to estimate the TSP cost by performing only sublinear (in the size of $D$) queries. For the closely related problem of estimating the weight of a metric minimum spanning tree (MST), it is known that for any $\epsilon > 0$, there exists an $\tilde{O}(n/\epsilon^{O(1)})$-time algorithm that returns a $(1+\epsilon)$-approximate estimate of the MST cost. This result immediately implies an $\tilde{O}(n/\epsilon^{O(1)})$-time algorithm to estimate the TSP cost to within a $(2+\epsilon)$ factor for any $\epsilon > 0$. However, no $o(n^2)$-time algorithms are known to approximate metric TSP to a factor strictly better than 2. On the other hand, there were also no known barriers ruling out the existence of $(1+\epsilon)$-approximate estimation algorithms for metric TSP running in $\tilde{O}(n)$ time for any fixed $\epsilon > 0$. In this paper, we make progress on both algorithms and lower bounds for estimating metric TSP cost.

On the algorithmic side, we first consider the graphic TSP problem, where the metric $D$ corresponds to shortest-path distances in a connected unweighted undirected graph. We show that there exists an $\tilde{O}(n)$-time algorithm that estimates the cost of graphic TSP to within a factor of $(2 - \epsilon_0)$ for some $\epsilon_0 > 0$. This is the first sublinear cost-estimation algorithm for graphic TSP that achieves an approximation factor less than 2. We also consider another well-studied special case of metric TSP, namely (1,2)-TSP, where all distances are either 1 or 2, and give an $\tilde{O}(n^{1.5})$-time algorithm to estimate the optimal cost to within a factor of 1.625. Our estimation algorithms for graphic TSP as well as for (1,2)-TSP naturally lend themselves to $\tilde{O}(n)$-space streaming algorithms that give an 11/6-approximation for graphic TSP and a 1.625-approximation for (1,2)-TSP. These results motivate the natural question of whether, analogously to metric MST, $(1+\epsilon)$-approximate estimates can be obtained for graphic TSP and (1,2)-TSP using $\tilde{O}(n)$ queries for any $\epsilon > 0$. We answer this question in the negative: there exists an $\epsilon_0 > 0$ such that any algorithm that estimates the cost of graphic TSP (respectively, (1,2)-TSP) to within a $(1+\epsilon_0)$ factor necessarily requires $\Omega(n^2)$ queries. This lower bound highlights a sharp separation between the metric MST and metric TSP problems.

Like many classical approximation algorithms for TSP, our sublinear-time estimation algorithms utilize subroutines for estimating the size of a maximum matching in the underlying graph. We show that this is not merely an artifact of our approach: for any $\epsilon > 0$, any algorithm that estimates the cost of graphic TSP or (1,2)-TSP to within a $(1+\epsilon)$ factor can also be used to estimate the size of a maximum matching in a bipartite graph to within an $\epsilon n$ additive error. This connection allows us to translate known lower bounds for matching-size estimation in various models to similar lower bounds for metric TSP cost estimation.
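The $(2+\epsilon)$ estimate mentioned above rests on the classical fact that in any metric, the MST weight lower-bounds the optimal tour cost, while doubling the MST yields a tour of at most twice its weight. A minimal sketch of this relationship (exact and quadratic-time, not the paper's sublinear algorithm; `mst_weight` and `tsp_upper_estimate` are illustrative names):

```python
import heapq

def mst_weight(D):
    """Prim's algorithm on a full n x n distance matrix D; returns MST weight."""
    n = len(D)
    in_tree = [False] * n
    best = [float("inf")] * n
    total = 0.0
    heap = [(0.0, 0)]  # (edge weight to reach vertex, vertex)
    while heap:
        w, u = heapq.heappop(heap)
        if in_tree[u]:
            continue
        in_tree[u] = True
        total += w
        for v in range(n):
            if not in_tree[v] and D[u][v] < best[v]:
                best[v] = D[u][v]
                heapq.heappush(heap, (D[u][v], v))
    return total

def tsp_upper_estimate(D):
    """MST <= TSP <= 2 * MST in any metric, so 2 * MST is a 2-approximate estimate."""
    return 2.0 * mst_weight(D)
```

For the unit square under Euclidean distance, the MST weight is 3 and the optimal tour costs 4, so the estimate $2 \cdot 3 = 6$ is within a factor 2 of the optimum.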
Robust estimation via generalized quasi-gradients
We explore why many recently proposed robust estimation problems are efficiently solvable, even though the underlying optimization problems are non-convex. We study the loss landscape of these robust estimation problems, and identify the existence of 'generalized quasi-gradients'. Whenever these quasi-gradients exist, a large family of no-regret algorithms are guaranteed to approximate the global minimum; this includes the commonly used filtering algorithm. For robust mean estimation of distributions under bounded covariance, we show that any first-order stationary point of the associated optimization problem is an approximate global minimum if and only if the corruption level $\epsilon < 1/3$. Consequently, any optimization algorithm that approaches a stationary point yields an efficient robust estimator with breakdown point $1/3$. With carefully designed initialization and step size, we improve this to $1/2$, which is optimal. For other tasks, including linear regression and joint mean and covariance estimation, the loss landscape is more rugged: there are stationary points arbitrarily far from the global minimum. Nevertheless, we show that generalized quasi-gradients exist and construct efficient algorithms. These algorithms are simpler than previous ones in the literature, and for linear regression we improve the estimation error from $O(\sqrt{\epsilon})$ to the optimal rate of $O(\epsilon)$ for small $\epsilon$, assuming certified hypercontractivity. For mean estimation with near-identity covariance, we show that a simple gradient descent algorithm achieves breakdown point $1/3$ and iteration complexity $\tilde{O}(d/\epsilon^2)$.
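The filtering idea referenced above can be illustrated with a deliberately simplified one-dimensional sketch: while the empirical variance exceeds an assumed inlier bound, drop the point farthest from the current mean. This is a toy stand-in for the paper's high-dimensional filtering algorithm, with `sigma2` an assumed variance bound, not a parameter from the paper:

```python
import statistics

def filter_mean(xs, sigma2=1.0):
    """Toy 1-D filtering sketch: repeatedly remove the point farthest from
    the current mean while the empirical variance exceeds 2 * sigma2.
    Gross outliers inflate the variance and are pruned first."""
    pts = list(xs)
    while len(pts) > 1 and statistics.pvariance(pts) > 2 * sigma2:
        m = statistics.fmean(pts)
        pts.remove(max(pts, key=lambda x: abs(x - m)))
    return statistics.fmean(pts)
```

On a sample with inliers near 0 and a single gross outlier at 100, the naive mean is dragged far from 0 while the filtered mean stays close to it.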
NSF-PAR ID: 10338581
Journal Name: Information and Inference: A Journal of the IMA
Volume: 11
Issue: 2
ISSN: 2049-8772
Page Range: 581 to 636
Sponsoring Org: National Science Foundation
More Like this


We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints, with a focus on the fundamental tasks of robust sparse mean estimation and robust sparse PCA. We develop novel and simple optimization formulations for these problems such that any approximate stationary point of the associated optimization problem yields a near-optimal solution for the underlying robust estimation task. As a corollary, we obtain that any first-order method that efficiently converges to stationarity yields an efficient algorithm for these tasks. The obtained algorithms are simple, practical, and succeed under broader distributional assumptions compared to prior work.
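Sparsity constraints of this kind are commonly enforced with a hard-thresholding projection onto $k$-sparse vectors, the standard primitive in first-order methods for sparse estimation. A small illustrative helper (an assumption about the method's building blocks, not code from the paper):

```python
def hard_threshold(v, k):
    """Project v onto the set of k-sparse vectors: keep the k coordinates of
    largest magnitude and zero out the rest."""
    idx = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:k]
    keep = set(idx)
    return [v[i] if i in keep else 0.0 for i in range(len(v))]
```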

We present a general framework for designing efficient dynamic approximate algorithms for optimization on undirected graphs. In particular, we develop a technique that, given any problem that admits a certain notion of vertex sparsifiers, gives data structures that maintain approximate solutions in sublinear update and query time. We illustrate the applicability of our paradigm to the following problems. (1) A fully dynamic algorithm that approximates all-pairs maximum-flows/minimum-cuts up to a nearly logarithmic factor in $\tilde{O}(n^{2/3})$ amortized time against an oblivious adversary, and $\tilde{O}(m^{3/4})$ time against an adaptive adversary. (2) An incremental data structure that maintains $O(1)$-approximate shortest paths in $n^{o(1)}$ time per operation, as well as fully dynamic approximate all-pairs shortest paths and transshipment in $\tilde{O}(n^{2/3+o(1)})$ amortized time per operation. (3) A fully dynamic algorithm that approximates all-pairs effective resistances up to a $(1+\epsilon)$ factor in $\tilde{O}(n^{2/3+o(1)} \epsilon^{-O(1)})$ amortized update time per operation. The key tool behind result (1) is the dynamic maintenance of an algorithmic construction due to Madry [FOCS '10], which partitions a graph into a collection of simpler graph structures (known as j-trees) and approximately captures the cut-flow and metric structure of the graph. The $O(1)$-approximation guarantee of (2) is obtained by adapting the distance oracles of [Thorup-Zwick, JACM '05]. Result (3) is obtained by invoking the random-walk-based spectral vertex sparsifier of [Durfee et al., STOC '19] in a hierarchical manner, while carefully keeping track of the recourse among levels in the hierarchy.

This work proposes a new algorithm, the Single-timescale Double-momentum Stochastic Approximation (SUSTAIN), for tackling stochastic unconstrained bilevel optimization problems. We focus on bilevel problems where the lower-level subproblem is strongly convex and the upper-level objective function is smooth. Unlike prior works, which rely on two-timescale or double-loop techniques, we design a stochastic momentum-assisted gradient estimator for both the upper- and lower-level updates. The latter allows us to control the error in the stochastic gradient updates due to inaccurate solutions to both subproblems. If the upper objective function is smooth but possibly non-convex, we show that SUSTAIN requires $O(\epsilon^{-3/2})$ iterations (each using $O(1)$ samples) to find an $\epsilon$-stationary solution. An $\epsilon$-stationary solution is defined as a point at which the squared norm of the gradient of the outer function is at most $\epsilon$. The total number of stochastic gradient samples required for the upper- and lower-level objective functions matches the best-known complexity for single-level stochastic gradient algorithms. We also analyze the case when the upper-level objective function is strongly convex.
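The momentum-assisted gradient estimator described above is of the recursive (STORM-style) form $d_t = \nabla f(x_t;\xi_t) + (1-a)\,(d_{t-1} - \nabla f(x_{t-1};\xi_t))$. A single-level toy sketch with exact gradients, where the correction term vanishes and the iterates reduce to plain gradient descent; the function name, step size, and momentum constant are illustrative, not the paper's:

```python
def sustain_style_update(grad, x0, steps=200, eta=0.1, a=0.5):
    """Illustrates the recursive momentum-assisted estimator
    d_t = grad(x_t) + (1 - a) * (d_{t-1} - grad(x_{t-1})) on a
    scalar problem. With exact gradients, d_t tracks grad(x_t) exactly,
    so the iterates coincide with gradient descent."""
    x = x0
    d = grad(x)
    for _ in range(steps):
        x_new = x - eta * d
        d = grad(x_new) + (1.0 - a) * (d - grad(x))
        x = x_new
    return x
```

For example, minimizing $f(x) = x^2$ (gradient $2x$) from $x_0 = 1$ drives the iterate to 0.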

The iteratively reweighted least squares method (IRLS) is a popular technique used in practice for solving regression problems. Various versions of this method have been proposed, but their theoretical analyses failed to capture the good practical performance. In this paper we propose a simple and natural version of IRLS for solving $\ell_\infty$ and $\ell_1$ regression, which provably converges to a $(1+\epsilon)$-approximate solution in $O(m^{1/3}\log(1/\epsilon)/\epsilon^{2/3} + \log m/\epsilon^2)$ iterations, where $m$ is the number of rows of the input matrix. Interestingly, this running time is independent of the conditioning of the input, and the dominant term of the running time depends sublinearly on $\epsilon^{-1}$, which is atypical for the optimization of non-smooth functions. This improves upon the more complex algorithms of Chin et al. (ITCS '12) and Christiano et al. (STOC '11) by a factor of at least $1/\epsilon^2$, and yields a truly efficient natural algorithm for the slime mold dynamics (Straszak-Vishnoi, SODA '16, ITCS '16, ITCS '17).
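The IRLS template can be illustrated on the simplest case, one-parameter $\ell_1$ regression: each iteration solves a weighted least-squares problem whose weights $w_i = 1/\max(|r_i|, \delta)$ down-weight large residuals so the quadratic surrogate tracks the $\ell_1$ objective. This toy sketch, with an assumed damping floor `delta`, is not the paper's algorithm, which handles the general matrix case with the stated guarantees:

```python
def irls_l1_1d(a, b, iters=100, delta=1e-8):
    """IRLS sketch for min_x sum_i |a_i * x - b_i|.
    Each step solves the weighted least-squares problem
    min_x sum_i w_i * (a_i * x - b_i)^2, whose closed-form solution is
    x = sum(w_i a_i b_i) / sum(w_i a_i^2)."""
    x = 0.0
    for _ in range(iters):
        # Reweight: small residual -> large weight, capped by the floor delta.
        w = [1.0 / max(abs(ai * x - bi), delta) for ai, bi in zip(a, b)]
        x = (sum(wi * ai * bi for wi, ai, bi in zip(w, a, b))
             / sum(wi * ai * ai for wi, ai in zip(w, a)))
    return x
```

On data where most targets equal 1 and one is a gross outlier, the $\ell_1$ minimizer is the median (here 1), and the iterates converge to it, whereas plain least squares would return the outlier-skewed mean.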