 Award ID(s):
 2210583
 NSFPAR ID:
 10508719
 Publisher / Repository:
 PMLR
 Date Published:
 Journal Name:
 Proceedings of Machine Learning Research
 Volume:
 195
 ISSN:
2640-3498
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

Motivated by problems in optimization, we study the sparsity of solutions to systems of linear Diophantine equations and linear integer programs, i.e., the number of nonzero entries of a solution, which is often referred to as the ℓ0-norm. Our main results are improved bounds on the ℓ0-norm of sparse solutions to systems Ax = b, where A ∈ ℤ^(m×n), b ∈ ℤ^m, and x is either a general integer vector (lattice case) or a nonnegative integer vector (semigroup case). In the lattice case and certain scenarios of the semigroup case, we give polynomial-time algorithms for computing solutions whose ℓ0-norm satisfies the obtained bounds.
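The search problem in this abstract is easy to state concretely. Below is a minimal brute-force sketch (not the paper's polynomial-time algorithm) that finds a nonnegative integer solution of Ax = b with minimum ℓ0-norm by enumerating supports of increasing size; the matrix, right-hand side, and the max_entry bound are illustrative.

```python
# Brute-force search for a sparsest nonnegative integer solution of A x = b.
# Purely illustrative: exponential time, entries bounded by max_entry. The
# paper's results give bounds and polynomial-time algorithms instead.
from itertools import combinations, product

import numpy as np

def sparsest_nonneg_solution(A, b, max_entry=10):
    """Return an x >= 0 with A x = b of minimum l0-norm, or None."""
    m, n = A.shape
    for k in range(n + 1):                        # candidate l0-norm
        for support in combinations(range(n), k):
            # place positive values (exactly k nonzeros) on the support
            for vals in product(range(1, max_entry + 1), repeat=k):
                x = np.zeros(n, dtype=int)
                x[list(support)] = vals
                if np.array_equal(A @ x, b):
                    return x
    return None

A = np.array([[2, 3, 5, 7],
              [1, 1, 2, 3]])
b = np.array([12, 5])
print(sparsest_nonneg_solution(A, b))             # -> [3 2 0 0], l0-norm 2
```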

Series of univariate distributions indexed by equally spaced time points are ubiquitous in applications, and their analysis constitutes one of the challenges of the emerging field of distributional data analysis. To quantify such distributional time series, we propose a class of intrinsic autoregressive models that operate in the space of optimal transport maps. The autoregressive transport models that we introduce here are based on regressing optimal transport maps on each other, where predictors can be transport maps from an overall barycenter to a current distribution or transport maps between past consecutive distributions of the distributional time series. Autoregressive transport models and their associated distributional regression models specify the link between predictor and response transport maps by moving along geodesics in Wasserstein space. These models emerge as natural extensions of the classical autoregressive models in Euclidean space. Unique stationary solutions of autoregressive transport models are shown to exist under a geometric moment contraction condition of Wu & Shao [(2004) Limit theorems for iterated random functions. Journal of Applied Probability 41, 425–436], using properties of iterated random functions. We also discuss an extension to a varying-coefficient model for first-order autoregressive transport models. In addition to simulations, the proposed models are illustrated with distributional time series of house prices across U.S. counties and annual summer temperature distributions.
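In one dimension, transport maps are compositions of quantile functions and CDFs, so the autoregressive recursion on transport maps reduces to a recursion on quantile functions. The sketch below simulates such a first-order model contracting toward a standard normal barycenter; the coefficient beta, the location-shift noise, and the grid are illustrative choices, not the paper's estimators.

```python
# 1-D simulation of an autoregressive transport model: in one dimension the
# optimal transport map from the barycenter to mu_t is Q_t o Fbar, so the AR
# recursion on transport maps becomes a recursion on quantile functions,
#     Q_{t+1} - Qbar = beta * (Q_t - Qbar) + noise.
# beta, sigma, and the location-shift noise are illustrative choices.
import numpy as np
from scipy.special import erfinv

rng = np.random.default_rng(0)
u = np.linspace(0.01, 0.99, 99)               # grid of quantile levels
Qbar = np.sqrt(2.0) * erfinv(2.0 * u - 1.0)   # N(0,1) barycenter quantiles

beta, sigma, T = 0.7, 0.3, 200
Q = Qbar.copy()                               # start the series at the barycenter
w2 = []                                       # Wasserstein-2 distance to barycenter
for _ in range(T):
    shift = sigma * rng.standard_normal()     # location noise keeps Q monotone
    Q = Qbar + beta * (Q - Qbar) + shift      # move along the Wasserstein geodesic
    # in 1-D, W2 is the L2 distance between quantile functions
    w2.append(np.sqrt(np.mean((Q - Qbar) ** 2)))

# with |beta| < 1 the series contracts geometrically toward a stationary
# regime, mirroring the geometric moment contraction condition
print(np.mean(w2[-50:]))
```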

The determinant maximization problem gives a general framework that models problems arising in fields as diverse as statistics [Puk06], convex geometry [Kha96], fair allocations [AGSS16], combinatorics [AGV18], spectral graph theory [NST19a], network design, and random processes [KT12]. In an instance of the determinant maximization problem, we are given a collection of vectors U = {v_1, ..., v_n} ⊆ R^d, and the goal is to pick a subset S ⊆ U of the given vectors to maximize the determinant of the matrix Σ_{i∈S} v_i v_i^T. Often, the set S of picked vectors must satisfy additional combinatorial constraints such as a cardinality constraint (|S| ≤ k) or a matroid constraint (S is a basis of a matroid defined on the vectors). In this paper, we give a polynomial-time deterministic algorithm that returns an r^O(r)-approximation for any matroid of rank r ≤ d. This improves previous results that give e^O(r^2)-approximation algorithms relying on e^O(r)-approximate estimation algorithms [NS16, AG17, AGV18, MNST20] for any r ≤ d. All previous results use convex relaxations and their relationship to stable polynomials and strongly log-concave polynomials. In contrast, our algorithm builds on combinatorial algorithms for matroid intersection, which iteratively improve any solution by finding an alternating negative cycle in the exchange graph defined by the matroids. While the det(·) function is not linear, we show that taking appropriate linear approximations at each iteration suffices to give the improved approximation algorithm.
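To make the objective concrete, here is the standard greedy baseline for the cardinality-constrained case (this is not the paper's matroid-intersection local search). It scores each candidate vector with the matrix determinant lemma, det(M + vv^T) = det(M)(1 + v^T M^{-1} v), and a small ridge term keeps M invertible before rank d is reached.

```python
# Greedy baseline for determinant maximization under |S| <= k, illustrating
# the objective det(sum_{i in S} v_i v_i^T). Candidates are scored with the
# matrix determinant lemma; the ridge and problem sizes are illustrative.
import numpy as np

def greedy_detmax(V, k, ridge=1e-9):
    """V: (n, d) array of vectors; returns indices of a greedily chosen S."""
    n, d = V.shape
    M = ridge * np.eye(d)
    chosen = []
    for _ in range(k):
        Minv = np.linalg.inv(M)
        # gain of adding v is proportional to 1 + v^T M^{-1} v
        gains = 1.0 + np.einsum("nd,de,ne->n", V, Minv, V)
        gains[chosen] = -np.inf                 # each vector used at most once
        i = int(np.argmax(gains))
        chosen.append(i)
        M = M + np.outer(V[i], V[i])            # rank-one update of the sum
    return chosen

rng = np.random.default_rng(1)
V = rng.standard_normal((50, 5))
S = greedy_detmax(V, k=8)
print(S, np.linalg.slogdet(V[S].T @ V[S])[1])   # log det of the chosen subset
```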

We develop exact representations of training two-layer neural networks with rectified linear units (ReLUs) in terms of a single convex program whose number of variables is polynomial in the number of training samples and the number of hidden neurons. Our theory utilizes semi-infinite duality and minimum-norm regularization. We show that ReLU networks trained with standard weight decay are equivalent to block ℓ1-penalized convex models. Moreover, we show that certain standard convolutional linear networks are equivalent to semidefinite programs, which can be simplified to ℓ1-regularized linear models in a polynomial-sized discrete Fourier feature space.
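A rough sketch of the flavor of such a convex program: enumerate ReLU activation patterns D_i = diag(1[Xu ≥ 0]) and solve a group-ℓ1 (block ℓ1) penalized least-squares problem with linear constraints enforcing each pattern. Sampling a handful of patterns from random hyperplanes, as done here, is a simplification of the exact reformulation (which requires all patterns), and the data and lambda are illustrative.

```python
# Group-l1 penalized convex program over sampled ReLU activation patterns.
# A simplified sketch: the exact equivalence enumerates all patterns.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d, lam = 20, 3, 0.1
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

H = rng.standard_normal((d, 8))                       # random hyperplane normals
D = np.unique((X @ H >= 0).T.astype(float), axis=0)   # (P, n) activation patterns
P = D.shape[0]

V = cp.Variable((P, d))        # weights for the positive part of each pattern
W = cp.Variable((P, d))        # weights for the negative part of each pattern
pred, cons = 0, []
for i in range(P):
    Xi = D[i][:, None] * X                    # X with inactive rows zeroed out
    Si = (2.0 * D[i] - 1.0)[:, None] * X      # enforces the activation pattern
    pred = pred + Xi @ (V[i] - W[i])
    cons += [Si @ V[i] >= 0, Si @ W[i] >= 0]

penalty = sum(cp.norm(V[i], 2) + cp.norm(W[i], 2) for i in range(P))
prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(pred - y) + lam * penalty), cons)
prob.solve()
print("optimal value:", prob.value)
```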

We show that there are no spurious local minima in the nonconvex factorized parametrization of low-rank matrix recovery from incoherent linear measurements. With noisy measurements, we show that all local minima are very close to a global optimum. Together with a curvature bound at saddle points, this yields a polynomial-time global convergence guarantee for stochastic gradient descent from random initialization.
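The setting can be reproduced in a few lines: recover a planted rank-r PSD matrix M* = U*U*^T from linear measurements y_k = ⟨A_k, M*⟩ by gradient descent on the nonconvex factorized objective from random initialization. The minimal numpy sketch below uses illustrative problem sizes and step size; it illustrates the benign landscape, not the paper's proof.

```python
# Gradient descent on the factorized objective
#     f(U) = (1/2m) * sum_k (<A_k, U U^T> - y_k)^2
# from random initialization; sizes, noise-free measurements, and the step
# size are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d, r, m, steps, lr = 20, 2, 400, 2000, 0.005

Ustar = rng.standard_normal((d, r))
Mstar = Ustar @ Ustar.T                       # planted rank-r PSD matrix
A = rng.standard_normal((m, d, d))
A = (A + A.transpose(0, 2, 1)) / 2.0          # symmetric Gaussian measurements
y = np.einsum("kij,ij->k", A, Mstar)          # y_k = <A_k, M*>

U = 0.1 * rng.standard_normal((d, r))         # random initialization
for _ in range(steps):
    res = np.einsum("kij,ij->k", A, U @ U.T) - y
    grad = (2.0 / m) * np.einsum("k,kij->ij", res, A) @ U
    U -= lr * grad

# comparing U U^T with M* sidesteps the rotation ambiguity in U itself
print("relative error:", np.linalg.norm(U @ U.T - Mstar) / np.linalg.norm(Mstar))
```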