We investigate the approximability of the following optimization problem. The input is an n × n matrix A = (A_ij) with real entries and an origin-symmetric convex body K ⊂ ℝ^n that is given by a membership oracle. The task is to compute (or approximate) the maximum of the quadratic form ∑_{i=1}^n ∑_{j=1}^n A_ij x_i x_j = ⟨x, Ax⟩ as x ranges over K. This is a rich and expressive family of optimization problems; for different choices of the matrix A and the convex body K it includes a diverse range of problems such as max-cut, the Grothendieck/non-commutative Grothendieck inequalities, small-set expansion, and more. While the literature has studied these special cases using case-specific reasoning, here we develop a general methodology for treating the approximability and inapproximability aspects of these questions. The underlying geometry of K plays a critical role; we show, under commonly used complexity assumptions, that polynomial-time constant-factor approximability requires that K have a type-2 constant that grows slowly with n. However, we show that even when the type-2 constant is bounded, the problem can exhibit strong hardness of approximation. Thus, even within the realm of type-2 bodies, the approximability landscape is nuanced and subtle. Nevertheless, the link that we establish between optimization and the geometry of Banach spaces allows us …
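To make the objective concrete, here is a minimal Python sketch that evaluates ⟨x, Ax⟩ and runs a naive random-search baseline against a hypothetical membership oracle for K = [-1, 1]^n (the ℓ∞ ball). The oracle choice, function names, and the search itself are illustrative only; they are not the algorithms studied in the abstract above.

```python
import numpy as np

# Hypothetical membership oracle: take K = [-1, 1]^n (the l_infinity ball),
# an origin-symmetric convex body; this choice specializes the problem to a
# Grothendieck/max-cut-type quadratic maximization.
def in_K(x):
    return bool(np.all(np.abs(x) <= 1.0))

def quadratic_form(A, x):
    # Objective <x, Ax> = sum_{i,j} A_ij x_i x_j
    return float(x @ A @ x)

def random_search_max(A, n_samples=10000, seed=0):
    # Naive baseline: sample candidate points, keep the best objective value.
    # This only illustrates the problem; it is not an approximation algorithm.
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    best_val, best_x = -np.inf, None
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0, size=n)
        if in_K(x):
            val = quadratic_form(A, x)
            if val > best_val:
                best_val, best_x = val, x
    return best_x, best_val

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((8, 8))
    A = (A + A.T) / 2  # symmetrize so <x, Ax> is the usual quadratic form
    _, val = random_search_max(A)
    print("best sampled <x, Ax>:", val)
```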
Hardness and Algorithms for Robust and Sparse Optimization. Proceedings of the 39th International Conference on Machine Learning (PMLR), 162:17926-17944, 2022.
We explore algorithms and limitations for sparse optimization problems such as sparse linear regression and robust linear regression. The goal of the sparse linear regression problem is to identify a small number of key features, while the goal of the robust linear regression problem is to identify a small number of erroneous measurements. Specifically, given an input matrix A ∈ ℝ^{n×d} and a target vector b ∈ ℝ^n, the sparse linear regression problem seeks a k-sparse vector x ∈ ℝ^d that minimizes ‖Ax − b‖_2, while the robust linear regression problem seeks a set S that ignores at most k rows and a vector x that minimize ‖(Ax − b)_S‖_2. We first show bicriteria NP-hardness of approximation for robust regression, building on the work of [OWZ15], which implies a similar result for sparse regression. We further show fine-grained hardness of robust regression through a reduction from the minimum-weight k-clique conjecture. On the positive side, we give an algorithm for robust regression that achieves arbitrarily accurate additive error in runtime that closely matches the lower bound from the fine-grained hardness result, as well as an algorithm for sparse regression with similar runtime. Both our upper and lower bounds rely …
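As a concrete reading of the two objectives, the following Python sketch gives brute-force reference implementations: one enumerates size-k supports for sparse regression, the other enumerates sets of k rows to drop for robust regression. These exponential-time baselines only illustrate the definitions; they are not the algorithms or reductions from the paper.

```python
import numpy as np
from itertools import combinations

def sparse_regression_bruteforce(A, b, k):
    # Enumerate all supports S of size k; solve least squares restricted to the
    # columns in S; return the best ||Ax - b||_2.  Exponential-time reference only.
    n, d = A.shape
    best_val, best_x = np.inf, None
    for S in combinations(range(d), k):
        cols = list(S)
        xs, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
        val = np.linalg.norm(A[:, cols] @ xs - b)
        if val < best_val:
            best_val = val
            best_x = np.zeros(d)
            best_x[cols] = xs
    return best_x, best_val

def robust_regression_bruteforce(A, b, k):
    # Enumerate all sets of k rows to ignore; fit least squares on the
    # remaining rows S; return the best ||(Ax - b)_S||_2.
    n, d = A.shape
    best_val, best_x = np.inf, None
    for drop in combinations(range(n), k):
        keep = [i for i in range(n) if i not in drop]
        xs, *_ = np.linalg.lstsq(A[keep], b[keep], rcond=None)
        val = np.linalg.norm(A[keep] @ xs - b[keep])
        if val < best_val:
            best_val, best_x = val, xs
    return best_x, best_val
```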
- Award ID(s): 2022448
- Publication Date: 2022
- NSF-PAR ID: 10341762
- Journal Name: Proceedings of the 39th International Conference on Machine Learning (PMLR)
- Issue: 162
- Page Range or eLocation-ID: 17926-17944
- Sponsoring Org: National Science Foundation
More Like this
We study the A-optimal design problem, where we are given vectors v_1, …, v_n ∈ ℝ^d and an integer k ≥ d, and the goal is to select a set S of k vectors that minimizes the trace of (∑_{i∈S} v_i v_i^⊤)^{−1}. Traditionally, the problem is an instance of optimal design of experiments in statistics [35], where each vector corresponds to a linear measurement of an unknown vector and the goal is to pick k of them that minimize the average variance of the error in the maximum likelihood estimate of the vector being measured. The problem also finds applications in sensor placement in wireless networks [22], sparse least squares regression [8], feature selection for k-means clustering [9], and matrix approximation [13, 14, 5]. In this paper, we introduce proportional volume sampling to obtain improved approximation algorithms for A-optimal design. Given a matrix, proportional volume sampling picks a set of columns S of size k with probability proportional to µ(S) times det(∑_{i∈S} v_i v_i^⊤) for some measure µ. Our main result shows that the approximability of the A-optimal design problem can be reduced to approximate independence properties of the measure µ. We appeal to hardcore distributions as candidate distributions µ that allow us …
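A toy Python sketch of the sampling rule follows, assuming n is small enough that all size-k subsets can be enumerated and using a uniform placeholder for the measure µ (the paper's candidates are hard-core distributions); the function names are ad hoc.

```python
import numpy as np
from itertools import combinations

def a_objective(V, S):
    # A-optimality objective: trace of the inverse of sum_{i in S} v_i v_i^T
    M = sum(np.outer(V[i], V[i]) for i in S)
    return float(np.trace(np.linalg.inv(M)))

def proportional_volume_sample(V, k, mu=None, rng=None):
    # Toy proportional volume sampling by full enumeration (small n only):
    # pick S of size k with probability proportional to
    # mu(S) * det(sum_{i in S} v_i v_i^T).  The uniform placeholder measure
    # stands in for the hard-core distributions used in the paper.
    rng = rng if rng is not None else np.random.default_rng()
    mu = mu if mu is not None else (lambda S: 1.0)
    n = len(V)
    subsets = list(combinations(range(n), k))
    weights = np.array([
        mu(S) * np.linalg.det(sum(np.outer(V[i], V[i]) for i in S))
        for S in subsets
    ])
    probs = weights / weights.sum()
    return subsets[rng.choice(len(subsets), p=probs)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    V = rng.standard_normal((6, 3))   # n = 6 vectors in R^3, so k >= d = 3
    S = proportional_volume_sample(V, k=4, rng=rng)
    print("sampled S:", S, "A-objective:", a_objective(V, S))
```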
https://arxiv.org/abs/2007.14539 As in standard linear regression, in truncated linear regression we are given access to observations (A_i, y_i)_i whose dependent variable equals y_i = A_i^⊤·x* + η_i, where x* is some fixed unknown vector of interest and η_i is independent noise; except we are only given an observation if its dependent variable y_i lies in some "truncation set" S ⊂ ℝ. The goal is to recover x* under some favorable conditions on the A_i's and the noise distribution. We prove that there exists a computationally and statistically efficient method for recovering k-sparse n-dimensional vectors x* from m truncated samples, which attains an optimal ℓ_2 reconstruction error of O(√((k log n)/m)). As a corollary, our guarantees imply a computationally efficient and information-theoretically optimal algorithm for compressed sensing with truncation, which may arise from measurement saturation effects. Our result follows from a statistical and computational analysis of the Stochastic Gradient Descent (SGD) algorithm for solving a natural adaptation of the LASSO optimization problem that accommodates truncation. This generalizes the works of both (1) [Daskalakis et al. 2018], where no regularization is needed due to the low-dimensionality of the data, and (2) [Wainwright 2009], where the objective function is simple due to the absence of truncation. In order to deal with both …
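The following Python sketch shows one plausible form of SGD on a truncation-aware LASSO objective, assuming unit-variance Gaussian noise and an interval truncation set S = (lo, hi); the ℓ1 handling, step sizes, and other details are simplifications and may differ from the authors' exact procedure.

```python
import numpy as np
from scipy.stats import norm

def truncated_lasso_sgd(A, y, S=(0.0, np.inf), lam=0.1, lr=0.01, epochs=50, seed=0):
    # Sketch of SGD on a truncation-aware LASSO objective, assuming unit-variance
    # Gaussian noise and an interval truncation set S = (lo, hi).  Per-sample loss:
    #   (a_i.x - y_i)^2 / 2 + log P(a_i.x + noise in S) + lam * ||x||_1,
    # where the log-partition term accounts for only observing samples whose
    # response lands in S.  A plain subgradient handles the l1 term.
    rng = np.random.default_rng(seed)
    n, d = A.shape
    lo, hi = S
    x = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            m = A[i] @ x
            g = (m - y[i]) * A[i]                      # gradient of squared term
            Z = norm.cdf(hi - m) - norm.cdf(lo - m)    # P(response in S)
            dZ = -(norm.pdf(hi - m) - norm.pdf(lo - m))
            g += (dZ / max(Z, 1e-12)) * A[i]           # gradient of log-partition
            g += lam * np.sign(x)                      # subgradient of l1 penalty
            x -= lr * g
    return x
```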
We consider partially-specified optimization problems where the goal is to actively, but efficiently, acquire missing information about the problem in order to solve it. An algorithm designer wishes to solve a linear program (LP), max c^⊤x s.t. Ax ≤ b, x ≥ 0, but does not initially know some of the parameters. The algorithm can iteratively choose an unknown parameter and gather information in the form of a noisy sample centered at the parameter's (unknown) value. The goal is to find an approximately feasible and optimal solution to the underlying LP with high probability while drawing a small number of samples. We focus on two cases. (1) When the parameters b of the constraints are initially unknown, we propose an efficient algorithm combining techniques from the ellipsoid method for LP and confidence-bound approaches from bandit algorithms. The algorithm adaptively gathers information about constraints only as needed in order to make progress. We give sample complexity bounds for the algorithm and demonstrate its improvement over a naive approach via simulation. (2) When the parameters c of the objective are initially unknown, we take an information-theoretic approach and give roughly matching upper and lower …
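A simplified Python illustration of case (1), assuming the unknown right-hand sides b can be queried through a hypothetical sampler sample_b(j) with bounded noise; it uses Hoeffding confidence intervals with repeated LP solves rather than the ellipsoid-based method described above, and it samples all constraints uniformly instead of adaptively.

```python
import numpy as np
from scipy.optimize import linprog

def solve_lp_with_noisy_b(c, A, sample_b, rounds=20, samples_per_round=50,
                          delta=0.05, noise_range=1.0):
    # Illustration only (not the paper's algorithm): the right-hand sides b are
    # unknown but can be queried through noisy samples sample_b(j) with noise
    # bounded in [-noise_range, noise_range].  We maintain Hoeffding confidence
    # intervals for each b_j and repeatedly solve
    #   max c^T x  s.t.  Ax <= b_lower, x >= 0
    # with the lower confidence bounds, so the returned x remains feasible for
    # the true b with high probability.
    m = A.shape[0]
    sums = np.zeros(m)
    counts = np.zeros(m)
    x = None
    for _ in range(rounds):
        for j in range(m):
            for _ in range(samples_per_round):
                sums[j] += sample_b(j)
                counts[j] += 1
        means = sums / counts
        radius = noise_range * np.sqrt(np.log(2 * m / delta) / (2 * counts))
        res = linprog(-np.asarray(c), A_ub=A, b_ub=means - radius, bounds=(0, None))
        if res.success:
            x = res.x
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    A = np.array([[1.0, 2.0], [3.0, 1.0]])
    b_true = np.array([4.0, 6.0])
    c = np.array([1.0, 1.0])
    noisy = lambda j: b_true[j] + rng.uniform(-1.0, 1.0)
    print("approx solution:", solve_lp_with_noisy_b(c, A, noisy))
```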
Quantum computational supremacy arguments, which describe a way for a quantum computer to perform a task that cannot also be done by a classical computer, typically require some sort of computational assumption related to the limitations of classical computation. One common assumption is that the polynomial hierarchy (PH) does not collapse, a stronger version of the statement that P ≠ NP, which leads to the conclusion that any classical simulation of certain families of quantum circuits requires time scaling worse than any polynomial in the size of the circuits. However, the asymptotic nature of this conclusion prevents us from calculating exactly how many qubits these quantum circuits must have for their classical simulation to be intractable on modern classical supercomputers. We refine these quantum computational supremacy arguments and perform such a calculation by imposing fine-grained versions of the non-collapse conjecture. Our first two conjectures, poly3-NSETH(a) and per-int-NSETH(b), take specific classical counting problems related to the number of zeros of a degree-3 polynomial in n variables over F_2 or the permanent of an n × n integer-valued matrix, and assert that any non-deterministic algorithm that solves them requires 2^{cn} …
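For reference, the two counting problems behind poly3-NSETH(a) and per-int-NSETH(b) can be stated in a few lines of Python: a brute-force count of the zeros of a degree-3 polynomial over F_2, and the permanent of an integer matrix via Ryser's formula. These exponential-time routines only pin down the problems themselves; they say nothing about the conjectured lower bounds.

```python
from itertools import product

def count_zeros_deg3_poly_F2(n, monomials):
    # Brute-force count of the zeros of a degree-<=3 polynomial over F_2 in n
    # variables.  Each monomial is a tuple of variable indices (the empty tuple
    # is the constant 1); the polynomial is the XOR of its monomials.
    zeros = 0
    for x in product((0, 1), repeat=n):
        val = 0
        for mono in monomials:
            term = 1
            for i in mono:
                term &= x[i]
            val ^= term
        zeros += (val == 0)
    return zeros

def permanent(M):
    # Permanent of an n x n integer matrix via Ryser's inclusion-exclusion
    # formula, O(2^n * n^2) time.
    n = len(M)
    total = 0
    for mask in range(1, 1 << n):
        bits = bin(mask).count("1")
        prod = 1
        for i in range(n):
            prod *= sum(M[i][j] for j in range(n) if (mask >> j) & 1)
        total += (-1) ** bits * prod
    return (-1) ** n * total

if __name__ == "__main__":
    # x0*x1*x2 + x0 + 1 over F_2, in 3 variables
    print(count_zeros_deg3_poly_F2(3, [(0, 1, 2), (0,), ()]))
    print(permanent([[1, 2], [3, 4]]))  # 1*4 + 2*3 = 10
```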