Experimental design is a classical area in statistics and has also found new applications in machine learning. In the combinatorial experimental design problem, the aim is to estimate an unknown mdimensional vector x from linear measurements where a Gaussian noise is introduced in each measurement. The goal is to pick k out of the given n experiments so as to make the most accurate estimate of the unknown parameter x. Given a set S of chosen experiments, the most likelihood estimate x0 can be obtained by a least squares computation. One of the robust measures of error estimation is themore »
Proportional Volume Sampling and Approximation Algorithms for AOptimal Design
We study the Aoptimal design problem where we are given vectors υ1, …, υn ∊ ℝd, an integer k ≥ d, and the goal is to select a set S of k vectors that minimizes the trace of (∑i∊Svivi⊺)−1. Traditionally, the problem is an instance of optimal design of experiments in statistics [35] where each vector corresponds to a linear measurement of an unknown vector and the goal is to pick k of them that minimize the average variance of the error in the maximum likelihood estimate of the vector being measured. The problem also finds applications in sensor placement in wireless networks [22], sparse least squares regression [8], feature selection for kmeans clustering [9], and matrix approximation [13, 14, 5]. In this paper, we introduce proportional volume sampling to obtain improved approximation algorithms for Aoptimal design.
Given a matrix, proportional volume sampling involves picking a set of columns S of size k with probability proportional to µ(S) times det(∑i∊Svivi⊺) for some measure µ. Our main result is to show the approximability of the Aoptimal design problem can be reduced to approximate independence properties of the measure µ. We appeal to hardcore distributions as candidate distributions µ that allow us to more »
 Award ID(s):
 1717947
 Publication Date:
 NSFPAR ID:
 10106914
 Journal Name:
 Symposium on Discrete Algorithms (SODA)
 Sponsoring Org:
 National Science Foundation
More Like this


We explore algorithms and limitations for sparse optimization problems such as sparse linear regression and robust linear regression. The goal of the sparse linear regression problem is to identify a small number of key features, while the goal of the robust linear regression problem is to identify a small number of erroneous measurements. Specifically, the sparse linear regression problem seeks a ksparse vector x ∈ Rd to minimize ‖Ax − b‖2, given an input matrix A ∈ Rn×d and a target vector b ∈ Rn, while the robust linear regression problem seeks a set S that ignores at most kmore »

We consider the communication complexity of a number of distributed optimization problems. We start with the problem of solving a linear system. Suppose there is a coordinator together with s servers P1, …, Ps, the ith of which holds a subset A(i) x = b(i) of ni constraints of a linear system in d variables, and the coordinator would like to output an x ϵ ℝd for which A(i) x = b(i) for i = 1, …, s. We assume each coefficient of each constraint is specified using L bits. We first resolve the randomized and deterministic communication complexity inmore »

In an optimal design problem, we are given a set of linear experiments v1,…,vn∈Rd and k≥d, and our goal is to select a set or a multiset S⊆[n] of size k such that Φ((∑i∈Sviv⊤i)−1) is minimized. When Φ(M)=Determinant(M)1/d, the problem is known as the Doptimal design problem, and when Φ(M)=Trace(M), it is known as the Aoptimal design problem. One of the most common heuristics used in practice to solve these problems is the local search heuristic, also known as the Fedorov’s exchange method (Fedorov, 1972). This is due to its simplicity and its empirical performance (Cook and Nachtrheim, 1980; Millermore »

Experimental design is a classical statistics problem, and its aim is to estimate an unknown vector from linear measurements where a Gaussian noise is introduced in each measurement. For the combinatorial experimental design problem, the goal is to pick a subset of experiments so as to make the most accurate estimate of the unknown parameters. In this paper, we will study one of the most robust measures of error estimation—the Doptimality criterion, which corresponds to minimizing the volume of the confidence ellipsoid for the estimation error. The problem gives rise to two natural variants depending on whether repetitions of experimentsmore »