We investigate the approximability of the following optimization problem. The input is an n × n matrix A = (A_ij) with real entries and an origin-symmetric convex body K ⊂ ℝ^n that is given by a membership oracle. The task is to compute (or approximate) the maximum of the quadratic form ∑_{i=1}^n ∑_{j=1}^n A_ij x_i x_j = ⟨x, Ax⟩ as x ranges over K. This is a rich and expressive family of optimization problems; for different choices of matrices A and convex bodies K it includes a diverse range of optimization problems such as max-cut, the Grothendieck and non-commutative Grothendieck inequalities, small-set expansion, and more. While the literature has studied these special cases using case-specific reasoning, here we develop a general methodology for treating the approximability and inapproximability aspects of these questions. The underlying geometry of K plays a critical role: we show, under commonly used complexity assumptions, that polynomial-time constant-factor approximability necessitates that K has a type-2 constant that grows slowly with n. However, we show that even when the type-2 constant is bounded, this problem sometimes exhibits strong hardness of approximation. Thus, even within the realm of type-2 bodies, the approximability landscape is nuanced and subtle. Nonetheless, the link that we establish between optimization and the geometry of Banach spaces allows us …
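To make the objective concrete: for the special case where K is the unit Euclidean ball, the maximum of ⟨x, Ax⟩ over K is exactly the largest eigenvalue of the symmetric part of A (clamped at 0, since x = 0 is always in K). This toy sketch illustrates that special case only — it is not the paper's general oracle-based algorithm, and the function name is ours:

```python
import numpy as np

def max_quadratic_on_ball(A):
    """Maximize <x, Ax> over the unit Euclidean ball.

    For this special choice of K the optimum equals the largest
    eigenvalue of the symmetric part (A + A.T)/2, attained at the
    corresponding unit eigenvector (or at x = 0 if that eigenvalue
    is negative, since <x, Ax> = 0 there).
    """
    S = (A + A.T) / 2.0                 # antisymmetric part contributes 0 to <x, Ax>
    eigvals, eigvecs = np.linalg.eigh(S)
    lam, v = eigvals[-1], eigvecs[:, -1]
    if lam <= 0:
        return 0.0, np.zeros(A.shape[0])
    return lam, v

A = np.array([[2.0, 1.0],
              [1.0, -1.0]])
val, x = max_quadratic_on_ball(A)       # val = (1 + sqrt(13)) / 2
```

For general bodies K given only by a membership oracle, no such closed form exists, which is exactly why the geometry of K enters the picture.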
This content will become publicly available on January 1, 2023
Hardness and Algorithms for Robust and Sparse Optimization. PMLR 162:17926–17944, 2022.
We explore algorithms and limitations for sparse optimization problems such as sparse linear regression and robust linear regression. The goal of the sparse linear regression problem is to identify a small number of key features, while the goal of the robust linear regression problem is to identify a small number of erroneous measurements. Specifically, the sparse linear regression problem seeks a k-sparse vector x ∈ ℝ^d to minimize ‖Ax − b‖₂, given an input matrix A ∈ ℝ^{n×d} and a target vector b ∈ ℝ^n, while the robust linear regression problem seeks a set S that ignores at most k rows and a vector x to minimize ‖(Ax − b)_S‖₂. We first show bicriteria NP-hardness of approximation for robust regression, building on the work of [OWZ15], which implies a similar result for sparse regression. We further show fine-grained hardness of robust regression through a reduction from the minimum-weight k-clique conjecture. On the positive side, we give an algorithm for robust regression that achieves arbitrarily accurate additive error and a runtime that closely matches the lower bound from the fine-grained hardness result, as well as an algorithm for sparse regression with similar runtime. Both our upper and lower bounds rely …
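The hardness results above concern beating brute force. For reference, the brute-force baseline for sparse regression — enumerate every size-k support and solve the restricted least-squares problem — can be sketched as follows; this is our own illustration of the naive O(d^k)-time approach, not the paper's algorithm:

```python
import itertools
import numpy as np

def sparse_regression_exact(A, b, k):
    """Exact k-sparse least squares by support enumeration.

    Tries every size-k subset of the d columns and solves the
    restricted least-squares problem; O(d^k) subproblems, which is
    the naive baseline the fine-grained lower bounds are measured
    against.
    """
    n, d = A.shape
    best_err, best_x = np.inf, None
    for support in itertools.combinations(range(d), k):
        cols = list(support)
        # Least squares restricted to the chosen support.
        z, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
        err = float(np.linalg.norm(A[:, cols] @ z - b))
        if err < best_err:
            best_err = err
            best_x = np.zeros(d)
            best_x[cols] = z
    return best_x, best_err

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 6))
x_true = np.zeros(6)
x_true[[1, 4]] = [3.0, -2.0]
b = A @ x_true                       # noiseless, so the true support achieves error 0
x_hat, err = sparse_regression_exact(A, b, 2)
```

On this noiseless instance the enumeration recovers the planted support exactly; the point of the hardness results is that no algorithm can do substantially better than this exponential-in-k enumeration in the worst case.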
 Award ID(s):
 2022448
 Publication Date:
 NSFPAR ID:
 10341762
 Journal Name:
 Proceedings of the 39th International Conference on Machine Learning (PMLR)
 Issue:
 162
 Page Range or eLocationID:
 17926–17944
 Sponsoring Org:
 National Science Foundation
More Like this


We study the A-optimal design problem where we are given vectors v₁, …, v_n ∈ ℝ^d and an integer k ≥ d, and the goal is to select a set S of k vectors that minimizes the trace of (∑_{i∈S} v_i v_i^⊺)^{−1}. Traditionally, the problem is an instance of optimal design of experiments in statistics [35], where each vector corresponds to a linear measurement of an unknown vector and the goal is to pick k of them that minimize the average variance of the error in the maximum likelihood estimate of the vector being measured. The problem also finds applications in sensor placement in wireless networks [22], sparse least squares regression [8], feature selection for k-means clustering [9], and matrix approximation [13, 14, 5]. In this paper, we introduce proportional volume sampling to obtain improved approximation algorithms for A-optimal design. Given a matrix, proportional volume sampling involves picking a set of columns S of size k with probability proportional to µ(S) times det(∑_{i∈S} v_i v_i^⊺) for some measure µ. Our main result is to show that the approximability of the A-optimal design problem can be reduced to approximate independence properties of the measure µ. We appeal to hard-core distributions as candidate distributions µ that allow us …
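The sampling distribution described above can be written out directly for tiny instances by enumerating all size-k subsets. This is only an illustrative brute force (the paper's contribution is making such sampling efficient and analyzing its approximation guarantee); the function names and the uniform choice of µ are ours:

```python
import itertools
import numpy as np

def a_optimal_objective(V, S):
    """trace((sum_{i in S} v_i v_i^T)^{-1}): the A-optimality criterion."""
    M = sum(np.outer(V[i], V[i]) for i in S)
    return float(np.trace(np.linalg.inv(M)))

def proportional_volume_sample(V, k, mu, rng):
    """Draw a size-k set S with P(S) ∝ mu(S) * det(sum_{i in S} v_i v_i^T).

    Brute force over all C(n, k) subsets -- only viable for tiny n;
    `mu` maps a subset (a tuple of indices) to a nonnegative weight.
    """
    subsets = list(itertools.combinations(range(len(V)), k))
    weights = np.array([
        mu(S) * np.linalg.det(sum(np.outer(V[i], V[i]) for i in S))
        for S in subsets
    ])
    probs = weights / weights.sum()
    return subsets[rng.choice(len(subsets), p=probs)]

rng = np.random.default_rng(1)
V = rng.standard_normal((6, 2))          # n = 6 measurement vectors in R^2
S = proportional_volume_sample(V, k=3, mu=lambda S: 1.0, rng=rng)
obj = a_optimal_objective(V, S)
```

Sets spanning a large volume get boosted probability via the determinant factor, which is why such samples tend to have a small A-optimality objective.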

https://arxiv.org/abs/2007.14539 As in standard linear regression, in truncated linear regression we are given access to observations (A_i, y_i) whose dependent variable equals y_i = A_i^⊺ · x* + η_i, where x* is some fixed unknown vector of interest and η_i is independent noise; except we are only given an observation if its dependent variable y_i lies in some "truncation set" S ⊂ ℝ. The goal is to recover x* under some favorable conditions on the A_i's and the noise distribution. We prove that there exists a computationally and statistically efficient method for recovering k-sparse n-dimensional vectors x* from m truncated samples, which attains an optimal ℓ₂ reconstruction error of O(√((k log n)/m)). As a corollary, our guarantees imply a computationally efficient and information-theoretically optimal algorithm for compressed sensing with truncation, which may arise from measurement saturation effects. Our result follows from a statistical and computational analysis of the Stochastic Gradient Descent (SGD) algorithm for solving a natural adaptation of the LASSO optimization problem that accommodates truncation. This generalizes the works of both: (1) [Daskalakis et al. 2018], where no regularization is needed due to the low dimensionality of the data, and (2) [Wainwright 2009], where the objective function is simple due to the absence of truncation. In order to deal with both …
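A minimal sketch of the SGD idea, under simplifying assumptions not taken from the paper: standard Gaussian noise, truncation set S = [0, ∞), and a plain soft-thresholding (proximal) step standing in for the ℓ₁ penalty. The key point it illustrates is the extra log Φ(a·x) term in the per-sample objective, which corrects for conditioning on the sample surviving truncation:

```python
import math
import numpy as np

def phi(t):   # standard normal pdf
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def Phi(t):   # standard normal cdf
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

rng = np.random.default_rng(0)
d, k, m = 10, 2, 4000
x_star = np.zeros(d)
x_star[:k] = [1.0, -0.5]                 # k-sparse ground truth

# Rejection-sample truncated observations: keep (a_i, y_i) only when
# y_i lands in the truncation set S = [0, inf).
rows, ys = [], []
while len(ys) < m:
    a = rng.standard_normal(d)
    y = a @ x_star + rng.standard_normal()
    if y >= 0.0:
        rows.append(a)
        ys.append(y)
A, y = np.array(rows), np.array(ys)

# SGD on the truncated negative log-likelihood
#   NLL(x; a, y) = (y - a.x)^2 / 2 + log Phi(a.x),
# where log Phi(a.x) accounts for conditioning on y >= 0; the
# soft-thresholding step plays the role of the l1 (LASSO) penalty.
x = np.zeros(d)
step, lam = 1e-3, 1e-3
for _ in range(10000):
    i = rng.integers(m)
    mu = float(A[i] @ x)
    g = (-(y[i] - mu) + phi(mu) / Phi(mu)) * A[i]
    x -= step * g
    x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)
```

Without the φ/Φ correction this reduces to ordinary least-squares SGD, which is biased on truncated data; with it, the iterates drift toward the true sparse x*.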

We consider partially-specified optimization problems where the goal is to actively, but efficiently, acquire missing information about the problem in order to solve it. An algorithm designer wishes to solve a linear program (LP), max c^⊺x s.t. Ax ≤ b, x ≥ 0, but does not initially know some of the parameters. The algorithm can iteratively choose an unknown parameter and gather information in the form of a noisy sample centered at the parameter's (unknown) value. The goal is to find an approximately feasible and optimal solution to the underlying LP with high probability while drawing a small number of samples. We focus on two cases. (1) When the parameters b of the constraints are initially unknown, we propose an efficient algorithm combining techniques from the ellipsoid method for LP and confidence-bound approaches from bandit algorithms. The algorithm adaptively gathers information about constraints only as needed in order to make progress. We give sample complexity bounds for the algorithm and demonstrate its improvement over a naive approach via simulation. (2) When the parameters c of the objective are initially unknown, we take an information-theoretic approach and give roughly matching upper and lower …
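The naive approach that the paper's adaptive algorithm improves upon can be sketched as follows: sample every entry of b the same number of times, build a Hoeffding lower confidence bound, and solve the LP against that pessimistic b so the answer stays feasible for the true b with high probability. This sketch assumes 1-subgaussian noise; the function name and the concrete instance are ours:

```python
import numpy as np
from scipy.optimize import linprog

def solve_lp_with_lcb(c, A, sample_b, n_samples, delta):
    """Naive (non-adaptive) baseline for max c.x s.t. Ax <= b, x >= 0
    when b is only observable through noisy samples.

    Draws `n_samples` samples of each b_j, then solves the LP against
    a lower confidence bound on b, so the returned point is feasible
    for the true b with probability >= 1 - delta (Hoeffding bound,
    assuming 1-subgaussian sampling noise).
    """
    m = A.shape[0]
    samples = np.array([[sample_b(j) for _ in range(n_samples)]
                        for j in range(m)])
    b_hat = samples.mean(axis=1)
    radius = np.sqrt(2.0 * np.log(2.0 * m / delta) / n_samples)
    b_lcb = b_hat - radius           # pessimistic b keeps us feasible
    res = linprog(-c, A_ub=A, b_ub=b_lcb, bounds=(0, None), method="highs")
    return res.x, -res.fun

rng = np.random.default_rng(0)
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
b_true = np.array([4.0, 3.0])        # unknown to the solver
sample_b = lambda j: b_true[j] + rng.standard_normal()   # noisy oracle
x, val = solve_lp_with_lcb(c, A, sample_b, n_samples=2000, delta=0.05)
```

The true optimum here is 7 at x = (1, 3); the confidence-bound solution lands slightly inside the feasible region. The adaptive algorithm in the abstract improves on this by spending samples only on constraints that could actually bind.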

Quantum computational supremacy arguments, which describe a way for a quantum computer to perform a task that cannot also be done by a classical computer, typically require some sort of computational assumption related to the limitations of classical computation. One common assumption is that the polynomial hierarchy (PH) does not collapse, a stronger version of the statement that P ≠ NP, which leads to the conclusion that any classical simulation of certain families of quantum circuits requires time scaling worse than any polynomial in the size of the circuits. However, the asymptotic nature of this conclusion prevents us from calculating exactly how many qubits these quantum circuits must have for their classical simulation to be intractable on modern classical supercomputers. We refine these quantum computational supremacy arguments and perform such a calculation by imposing fine-grained versions of the non-collapse conjecture. Our first two conjectures, poly3-NSETH(a) and per-int-NSETH(b), take specific classical counting problems related to the number of zeros of a degree-3 polynomial in n variables over F₂ or the permanent of an n × n integer-valued matrix, and assert that any nondeterministic algorithm that solves them requires 2^{cn} …
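The first counting problem is simple to state even though it is conjectured hard: count the assignments in F₂ⁿ that zero out a degree-3 polynomial. A brute-force counter makes the 2ⁿ scaling explicit (the conjecture asserts even nondeterministic algorithms cannot beat 2^{an} for some constant a); the encoding of the polynomial as a list of monomials is our own convention:

```python
import itertools

def count_zeros_f2(n, monomials):
    """Count assignments in F_2^n on which a degree-3 polynomial vanishes.

    `monomials` is a list of variable-index tuples (each of size <= 3);
    the polynomial is the sum mod 2 (XOR) of the corresponding products.
    Brute force costs 2^n evaluations -- exactly the scaling that
    poly3-NSETH(a) conjectures cannot be fundamentally improved.
    """
    zeros = 0
    for assignment in itertools.product((0, 1), repeat=n):
        value = 0
        for mono in monomials:
            prod = 1
            for idx in mono:
                prod &= assignment[idx]   # product over F_2 is AND
            value ^= prod                 # sum over F_2 is XOR
        zeros += value == 0
    return zeros

# f(x0, x1, x2) = x0*x1*x2 + x0 + x1  (over F_2): vanishes on 3 of the 8 inputs
zeros = count_zeros_f2(3, [(0, 1, 2), (0,), (1,)])   # -> 3
```

Tying a concrete counting problem like this to circuit simulation is what lets the authors translate the conjectured 2^{cn} lower bound into an explicit qubit count.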