Title: Proportional Volume Sampling and Approximation Algorithms for A-Optimal Design
We study the A-optimal design problem, where we are given vectors v_1, …, v_n ∈ ℝ^d and an integer k ≥ d, and the goal is to select a set S of k vectors that minimizes the trace of (∑_{i∈S} v_i v_i^⊤)^(−1). Traditionally, the problem is an instance of optimal design of experiments in statistics [35], where each vector corresponds to a linear measurement of an unknown vector and the goal is to pick k of them that minimize the average variance of the error in the maximum likelihood estimate of the vector being measured. The problem also finds applications in sensor placement in wireless networks [22], sparse least squares regression [8], feature selection for k-means clustering [9], and matrix approximation [13, 14, 5]. In this paper, we introduce proportional volume sampling to obtain improved approximation algorithms for A-optimal design. Given a matrix, proportional volume sampling picks a set of columns S of size k with probability proportional to µ(S) times det(∑_{i∈S} v_i v_i^⊤) for some measure µ. Our main result shows that the approximability of the A-optimal design problem can be reduced to approximate independence properties of the measure µ. We appeal to hardcore distributions as candidate distributions µ that allow us to obtain improved approximation algorithms for A-optimal design. Our results include a d-approximation when k = d, a (1 + ε)-approximation when […], and a […]-approximation when repetitions of vectors are allowed in the solution. We also consider a generalization of the problem to k ≤ d and obtain a k-approximation. We also show that the proportional volume sampling algorithm gives approximation algorithms for other optimal design objectives (such as D-optimal design [36] and the generalized ratio objective [27]), matching or improving the previous best known results. Interestingly, we show that a similar guarantee cannot be obtained for the E-optimal design problem. We also show that the A-optimal design problem is NP-hard to approximate within a fixed constant when k = d.
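The objective and the sampling distribution described above can be illustrated with a small brute-force sketch. It is not the paper's algorithm: it enumerates all size-k subsets (exponential time), it takes µ to be uniform by default rather than a hardcore distribution, and the function names (`gram`, `det_and_trace_inv`, `a_objective`, `proportional_volume_sample`) are hypothetical names chosen for this illustration. Exact rational arithmetic is used so that singular subsets are detected reliably.

```python
import itertools
import random
from fractions import Fraction


def gram(vectors, subset):
    """Return sum_{i in subset} v_i v_i^T as a d x d matrix of Fractions."""
    d = len(vectors[0])
    m = [[Fraction(0)] * d for _ in range(d)]
    for i in subset:
        v = [Fraction(x) for x in vectors[i]]
        for a in range(d):
            for b in range(d):
                m[a][b] += v[a] * v[b]
    return m


def det_and_trace_inv(m):
    """Gauss-Jordan elimination on [m | I].

    Returns (det(m), trace(m^-1)); the trace is None when m is singular.
    """
    d = len(m)
    aug = [row[:] + [Fraction(int(r == c)) for c in range(d)]
           for r, row in enumerate(m)]
    det = Fraction(1)
    for col in range(d):
        pivot = next((r for r in range(col, d) if aug[r][col] != 0), None)
        if pivot is None:
            return Fraction(0), None
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        p = aug[col][col]
        det *= p
        aug[col] = [x / p for x in aug[col]]
        for r in range(d):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return det, sum(aug[i][d + i] for i in range(d))


def a_objective(vectors, subset):
    """A-optimal objective: trace of (sum_{i in S} v_i v_i^T)^-1, or None if singular."""
    return det_and_trace_inv(gram(vectors, subset))[1]


def proportional_volume_sample(vectors, k, mu=lambda S: 1, rng=random):
    """Draw a size-k subset S with probability proportional to mu(S) * det(gram(S)).

    Assumption: mu defaults to the uniform measure; brute-force enumeration only.
    """
    subsets = list(itertools.combinations(range(len(vectors)), k))
    weights = [float(mu(S) * det_and_trace_inv(gram(vectors, S))[0])
               for S in subsets]
    return rng.choices(subsets, weights=weights, k=1)[0]
```

For instance, with the vectors (1, 0), (0, 1), (1, 1), (2, 0) in ℝ^2 and k = 2, the pair (0, 1), (2, 0) attains the smallest objective value, 5/4, while the pair (1, 0), (2, 0) has determinant zero and is therefore never sampled.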
Award ID(s):
1717947
NSF-PAR ID:
10106914
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Symposium on Discrete Algorithms (SODA)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Experimental design is a classical area in statistics and has also found new applications in machine learning. In the combinatorial experimental design problem, the aim is to estimate an unknown m-dimensional vector x from linear measurements, where Gaussian noise is introduced in each measurement. The goal is to pick k out of the given n experiments so as to make the most accurate estimate of the unknown parameter x. Given a set S of chosen experiments, the maximum likelihood estimate x0 can be obtained by a least squares computation. One of the robust measures of error estimation is the D-optimality criterion, which aims to minimize the generalized variance of the estimator. This corresponds to minimizing the volume of the standard confidence ellipsoid for the estimation error x − x0. The problem gives rise to two natural variants depending on whether repetitions of experiments are allowed or not. The latter variant, while being more general, has also found applications in geographical location of sensors. We show a close connection between approximation algorithms for the D-optimal design problem and constructions of approximately m-wise positively correlated distributions. This connection allows us to obtain the first approximation algorithms for the D-optimal design problem with and without repetitions. We then consider the case when the number of experiments chosen is much larger than the dimension m and show that one can obtain asymptotically optimal algorithms in this case.
  2. We study optimal design problems in which the goal is to choose a set of linear measurements to obtain the most accurate estimate of an unknown vector. We study the [Formula: see text]-optimal design variant where the objective is to minimize the average variance of the error in the maximum likelihood estimate of the vector being measured. We introduce the proportional volume sampling algorithm to obtain nearly optimal bounds in the asymptotic regime when the number [Formula: see text] of measurements made is significantly larger than the dimension [Formula: see text] and obtain the first approximation algorithms whose approximation factor does not degrade with the number of possible measurements when [Formula: see text] is small. The algorithm also gives approximation guarantees for other optimal design objectives such as [Formula: see text]-optimality and the generalized ratio objective, matching or improving the previously best-known results. We further show that bounds similar to ours cannot be obtained for [Formula: see text]-optimal design and that [Formula: see text]-optimal design is NP-hard to approximate within a fixed constant when [Formula: see text]. 
  3. The determinant maximization problem gives a general framework that models problems arising in fields as diverse as statistics [Puk06], convex geometry [Kha96], fair allocations [AGSS16], combinatorics [AGV18], spectral graph theory [NST19a], network design, and random processes [KT12]. In an instance of a determinant maximization problem, we are given a collection of vectors U = {v_1, …, v_n} ⊂ R^d, and the goal is to pick a subset S ⊆ U of the given vectors to maximize the determinant of the matrix ∑_{i∈S} v_i v_i^T. Often, the set S of picked vectors must satisfy additional combinatorial constraints such as a cardinality constraint (|S| ≤ k) or a matroid constraint (S is a basis of a matroid defined on the vectors). In this paper, we give a polynomial-time deterministic algorithm that returns an r^O(r)-approximation for any matroid of rank r ≤ d. This improves previous results that give e^O(r^2)-approximation algorithms relying on e^O(r)-approximate estimation algorithms [NS16, AG17, AGV18, MNST20] for any r ≤ d. All previous results use convex relaxations and their relationship to stable polynomials and strongly log-concave polynomials. In contrast, our algorithm builds on combinatorial algorithms for matroid intersection, which iteratively improve any solution by finding an alternating negative cycle in the exchange graph defined by the matroids. While the det(·) function is not linear, we show that taking appropriate linear approximations at each iteration suffices to give the improved approximation algorithm.
  4. We explore algorithms and limitations for sparse optimization problems such as sparse linear regression and robust linear regression. The goal of the sparse linear regression problem is to identify a small number of key features, while the goal of the robust linear regression problem is to identify a small number of erroneous measurements. Specifically, the sparse linear regression problem seeks a k-sparse vector x ∈ R^d to minimize ‖Ax − b‖_2, given an input matrix A ∈ R^(n×d) and a target vector b ∈ R^n, while the robust linear regression problem seeks a set S that ignores at most k rows and a vector x to minimize ‖(Ax − b)_S‖_2. We first show bicriteria NP-hardness of approximation for robust regression, building on the work of [OWZ15], which implies a similar result for sparse regression. We further show fine-grained hardness of robust regression through a reduction from the minimum-weight k-clique conjecture. On the positive side, we give an algorithm for robust regression that achieves arbitrarily accurate additive error and uses runtime that closely matches the lower bound from the fine-grained hardness result, as well as an algorithm for sparse regression with similar runtime. Both our upper and lower bounds rely on a general reduction from robust linear regression to sparse regression that we introduce. Our algorithms, inspired by the 3SUM problem, use approximate nearest neighbor data structures and may be of independent interest for solving sparse optimization problems. For instance, we demonstrate that our techniques can also be used for the well-studied sparse PCA problem.
  5. Hazay, Carmit ; Stam, Martijn (Ed.)
    We study the computational problem of finding a shortest non-zero vector in a rotation of ℤ^n, which we call ℤSVP. It has been a long-standing open problem to determine if a polynomial-time algorithm for ℤSVP exists, and there is by now a beautiful line of work showing how to solve it efficiently in certain very special cases. However, despite all of this work, the fastest known algorithm that is proven to solve ℤSVP is still simply the fastest known algorithm for solving SVP (i.e., the problem of finding shortest non-zero vectors in arbitrary lattices), which runs in 2^(n+o(n)) time. We therefore set aside the (perhaps impossible) goal of finding an efficient algorithm for ℤSVP and instead ask what else we can say about the problem. E.g., can we find any non-trivial speedup over the best known SVP algorithm? And, if ℤSVP actually is hard, then what consequences would follow? Our results are as follows. We show that ℤSVP is in a certain sense strictly easier than SVP on arbitrary lattices. In particular, we show how to reduce ℤSVP to an approximate version of SVP in the same dimension (in fact, even to approximate unique SVP, for any constant approximation factor). Such a reduction seems very unlikely to work for SVP itself, so we view this as a qualitative separation of ℤSVP from SVP. As a consequence of this reduction, we obtain a 2^(n/2+o(n))-time algorithm for ℤSVP, i.e., the first non-trivial speedup over the best known algorithm for SVP on general lattices. (In fact, this reduction works for a more general class of lattices: semi-stable lattices with not-too-large λ_1.) We show a simple public-key encryption scheme that is secure if (an appropriate variant of) ℤSVP is actually hard. Specifically, our scheme is secure if it is difficult to distinguish (in the worst case) a rotation of ℤ^n from either a lattice with all non-zero vectors longer than √(n/log n) or a lattice with smoothing parameter significantly smaller than the smoothing parameter of ℤ^n. The latter result has an interesting qualitative connection with reverse Minkowski theorems, which in some sense say that “ℤ^n has the largest smoothing parameter.” We show a distribution of bases B for rotations of ℤ^n such that, if ℤSVP is hard for any input basis, then ℤSVP is hard on input B. This gives a satisfying theoretical resolution to the problem of sampling hard bases for ℤ^n, which was studied by Blanks and Miller [9]. This worst-case to average-case reduction is also crucially used in the analysis of our encryption scheme. (In recent independent work that appeared as a preprint before this work, Ducas and van Woerden showed essentially the same thing for general lattices [15], and they also used this to analyze the security of a public-key encryption scheme. Similar ideas also appeared in [5, 11, 20] in different contexts.) We perform experiments to determine how practical basis reduction performs on bases of ℤ^n that are generated in different ways and how heuristic sieving algorithms perform on ℤ^n. Our basis reduction experiments complement and add to those performed by Blanks and Miller, as we work with a larger class of algorithms (i.e., larger block sizes) and study the “provably hard” distribution of bases described above. Our sieving experiments confirm that heuristic sieving algorithms perform as expected on ℤ^n.