We consider the problem of designing sublinear-time algorithms for estimating the cost of a minimum metric traveling salesman (TSP) tour. Specifically, given access to an $n \times n$ distance matrix $D$ that specifies pairwise distances between $n$ points, the goal is to estimate the TSP cost by performing only sublinear (in the size of $D$) queries. For the closely related problem of estimating the weight of a metric minimum spanning tree (MST), it is known that for any $\epsilon > 0$, there exists an $\tilde{O}(n/\epsilon^{O(1)})$-time algorithm that returns a $(1+\epsilon)$-approximate estimate of the MST cost. This result immediately implies an $\tilde{O}(n/\epsilon^{O(1)})$-time algorithm to estimate the TSP cost to within a $(2+\epsilon)$ factor for any $\epsilon > 0$. However, no $o(n^2)$-time algorithms are known to approximate metric TSP to a factor strictly better than 2. On the other hand, there are also no known barriers that rule out the existence of $(1+\epsilon)$-approximate estimation algorithms for metric TSP with $\tilde{O}(n)$ time for any fixed $\epsilon > 0$. In this paper, we make progress on both algorithms and lower bounds for estimating metric TSP cost. On the algorithmic side, we first consider the graphic TSP problem …
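The $(2+\epsilon)$ bound via MST estimation rests on a classical fact: doubling every edge of a spanning tree yields an Eulerian walk, and shortcutting it (valid by the triangle inequality) gives a tour of cost at most twice the MST weight. A minimal, non-sublinear Python sketch of this 2-approximate estimate, assuming Euclidean points (the function names are illustrative, not from the paper):

```python
import math

def mst_weight(points):
    # Prim's algorithm on the complete Euclidean graph over the points.
    n = len(points)
    dist = lambda a, b: math.dist(points[a], points[b])
    best = {i: dist(0, i) for i in range(1, n)}  # cheapest edge into the tree
    total = 0.0
    while best:
        u = min(best, key=best.get)   # closest vertex not yet in the tree
        total += best.pop(u)
        for v in best:
            best[v] = min(best[v], dist(u, v))
    return total

def tsp_upper_estimate(points):
    # Doubling the MST weight upper-bounds the optimal metric TSP tour
    # cost within a factor of 2 (double edges, shortcut the Euler walk).
    return 2.0 * mst_weight(points)
```

On the unit square, the MST has weight 3, so the estimate is 6, while the optimal tour has cost 4, illustrating the factor-2 gap.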
Robust estimation via generalized quasi-gradients
Abstract: We explore why many recently proposed robust estimation problems are efficiently solvable, even though the underlying optimization problems are nonconvex. We study the loss landscape of these robust estimation problems, and identify the existence of 'generalized quasi-gradients'. Whenever these quasi-gradients exist, a large family of no-regret algorithms are guaranteed to approximate the global minimum; this includes the commonly used filtering algorithm. For robust mean estimation of distributions under bounded covariance, we show that any first-order stationary point of the associated optimization problem is an approximate global minimum if and only if the corruption level $\epsilon < 1/3$. Consequently, any optimization algorithm that approaches a stationary point yields an efficient robust estimator with breakdown point $1/3$. With carefully designed initialization and step size, we improve this to $1/2$, which is optimal. For other tasks, including linear regression and joint mean and covariance estimation, the loss landscape is more rugged: there are stationary points arbitrarily far from the global minimum. Nevertheless, we show that generalized quasi-gradients exist and construct efficient algorithms. These algorithms are simpler than previous ones in the literature, and for linear regression we improve the estimation error from $O(\sqrt{\epsilon})$ to the optimal rate of $O(\epsilon)$ …
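The filtering algorithm mentioned above can be sketched in a few lines: while the empirical covariance has a large eigenvalue, discard the sample that projects most strongly onto the top eigenvector, then re-estimate. The following is a simplified deterministic variant; the threshold and one-at-a-time removal rule are illustrative choices, not the paper's exact scheme:

```python
import numpy as np

def filter_mean(X, threshold=9.0, max_iter=50):
    # Simplified filter for robust mean estimation under bounded
    # covariance: outliers inflate the top eigenvalue of the empirical
    # covariance, and score highest along the top eigenvector.
    X = np.asarray(X, dtype=float)
    for _ in range(max_iter):
        mu = X.mean(axis=0)
        cov = np.atleast_2d(np.cov(X, rowvar=False, bias=True))
        eigvals, eigvecs = np.linalg.eigh(cov)  # ascending eigenvalues
        if eigvals[-1] <= threshold:
            break  # covariance is bounded; stop filtering
        v = eigvecs[:, -1]
        scores = ((X - mu) @ v) ** 2  # squared projection onto top direction
        X = np.delete(X, np.argmax(scores), axis=0)
    return X.mean(axis=0)
```

With a small cluster of inliers and a couple of distant outliers, the loop removes the outliers first (they dominate the projection scores) and then stops once the remaining covariance is bounded.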
Publication Date:
NSF-PAR ID: 10338581
Journal Name: Information and Inference: A Journal of the IMA
Volume: 11
Issue: 2
Page Range or eLocation-ID: 581 to 636
ISSN: 2049-8772
Sponsoring Org: National Science Foundation
More Like This


We present a general framework for designing efficient dynamic approximate algorithms for optimization on undirected graphs. In particular, we develop a technique that, given any problem that admits a certain notion of vertex sparsifiers, gives data structures that maintain approximate solutions in sublinear update and query time. We illustrate the applicability of our paradigm to the following problems. (1) A fully dynamic algorithm that approximates all-pair maximum flows/minimum cuts up to a nearly logarithmic factor in $\tilde{O}(n^{2/3})$ amortized time against an oblivious adversary, and $\tilde{O}(m^{3/4})$ time against an adaptive adversary. (2) An incremental data structure that maintains $O(1)$-approximate shortest paths in $n^{o(1)}$ time per operation, as well as fully dynamic approximate all-pair shortest paths and transshipment in $\tilde{O}(n^{2/3+o(1)})$ amortized time per operation. (3) A fully dynamic algorithm that approximates all-pair effective resistances up to a $(1+\epsilon)$ factor in $\tilde{O}(n^{2/3+o(1)} \epsilon^{-O(1)})$ amortized update time per operation. The key tool behind result (1) is the dynamic maintenance of an algorithmic construction due to Madry [FOCS '10], which partitions a graph into a collection of simpler graph structures (known as j-trees) and approximately captures the cut-flow and metric structure of the graph. The $O(1)$-approximation guarantee of (2) is obtained by adapting the distance oracles of Thorup and Zwick [JACM '05]. …

The iteratively reweighted least squares method (IRLS) is a popular technique used in practice for solving regression problems. Various versions of this method have been proposed, but their theoretical analyses fail to capture the good practical performance. In this paper we propose a simple and natural version of IRLS for solving $\ell_\infty$ and $\ell_1$ regression, which provably converges to a $(1+\epsilon)$-approximate solution in $O(m^{1/3}\log(1/\epsilon)/\epsilon^{2/3} + \log m/\epsilon^2)$ iterations, where $m$ is the number of rows of the input matrix. Interestingly, this running time is independent of the conditioning of the input, and the dominant term of the running time depends sublinearly on $\epsilon^{-1}$, which is atypical for the optimization of non-smooth functions. This improves upon the more complex algorithms of Chin et al. (ITCS '12) and Christiano et al. (STOC '11) by a factor of at least $1/\epsilon^2$, and yields a truly efficient natural algorithm for the slime mold dynamics (Straszak-Vishnoi, SODA '16, ITCS '16, ITCS '17).
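The textbook IRLS iteration for $\ell_1$ regression (shown here as context for the abstract, not the paper's accelerated variant) reweights each row by the reciprocal of its current residual magnitude and re-solves a weighted least-squares problem; the cap `delta` is an illustrative safeguard against division by zero:

```python
import numpy as np

def irls_l1(A, b, iters=100, delta=1e-8):
    # Textbook IRLS for min_x ||Ax - b||_1: a weighted least-squares
    # problem with weights w_i = 1/|r_i| has the same minimizer, so we
    # alternate residual computation and weighted least squares.
    x = np.linalg.lstsq(A, b, rcond=None)[0]  # ordinary LS warm start
    for _ in range(iters):
        r = A @ x - b
        w = 1.0 / np.maximum(np.abs(r), delta)  # cap tiny residuals
        sw = np.sqrt(w)
        x = np.linalg.lstsq(sw[:, None] * A, sw * b, rcond=None)[0]
    return x
```

For example, fitting a constant to `[1, 1, 1, 1, 10]` under the $\ell_1$ objective recovers the median 1 rather than the outlier-pulled mean 2.8.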

This work proposes a new algorithm – the Single-timescale Double-momentum Stochastic Approximation (SUSTAIN) – for tackling stochastic unconstrained bilevel optimization problems. We focus on bilevel problems where the lower-level subproblem is strongly convex and the upper-level objective function is smooth. Unlike prior works, which rely on two-timescale or double-loop techniques, we design a stochastic momentum-assisted gradient estimator for both the upper- and lower-level updates. The latter allows us to control the error in the stochastic gradient updates due to inaccurate solutions to both subproblems. If the upper objective function is smooth but possibly nonconvex, we show that SUSTAIN requires $O(\epsilon^{-3/2})$ iterations (each using $O(1)$ samples) to find an $\epsilon$-stationary solution, defined as a point at which the squared norm of the gradient of the outer function is at most $\epsilon$. The total number of stochastic gradient samples required for the upper- and lower-level objective functions matches the best-known complexity for single-level stochastic gradient algorithms. We also analyze the case when the upper-level objective function is strongly convex.
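The momentum-assisted estimator described above follows the recursion $d_t = \nabla f(x_t;\xi_t) + (1-\alpha)\bigl(d_{t-1} - \nabla f(x_{t-1};\xi_t)\bigr)$, evaluating the same sample $\xi_t$ at two consecutive iterates so that the correction term cancels most of the noise. A minimal single-level sketch under an additive-noise oracle (step size, momentum parameter, and noise model are all illustrative assumptions, not the paper's bilevel setup):

```python
import numpy as np

def momentum_assisted_sgd(grad, x0, steps=500, eta=0.05, alpha=0.1,
                          noise=0.1, seed=0):
    # Recursive momentum gradient estimator (the building block applied
    # to both the upper- and lower-level updates in SUSTAIN):
    #   d_t = g(x_t; xi_t) + (1 - alpha) * (d_{t-1} - g(x_{t-1}; xi_t)),
    # with the same sample xi_t evaluated at both iterates.
    rng = np.random.default_rng(seed)
    x_prev = np.asarray(x0, dtype=float)
    d = grad(x_prev) + noise * rng.standard_normal(x_prev.shape)
    x = x_prev - eta * d
    for _ in range(steps):
        xi = noise * rng.standard_normal(x.shape)  # shared noise sample
        d = (grad(x) + xi) + (1 - alpha) * (d - (grad(x_prev) + xi))
        x_prev, x = x, x - eta * d
    return x
```

Because the shared sample appears in both gradient evaluations, only an $\alpha$-fraction of each fresh noise draw enters the estimate; on a smooth objective such as $\|x\|^2$ the iterates settle much closer to the minimizer than plain SGD at the same step size.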

We explore the connection between outlier-robust high-dimensional statistics and nonconvex optimization in the presence of sparsity constraints, with a focus on the fundamental tasks of robust sparse mean estimation and robust sparse PCA. We develop novel and simple optimization formulations for these problems, such that any approximate stationary point of the associated optimization problem yields a near-optimal solution for the underlying robust estimation task. As a corollary, we obtain that any first-order method that efficiently converges to stationarity yields an efficient algorithm for these tasks. The resulting algorithms are simple, practical, and succeed under broader distributional assumptions than prior work.