Title: Adaptive Geometric Multiscale Approximations for Intrinsically Low-dimensional Data
We consider the problem of efficiently approximating and encoding high-dimensional data sampled from a probability distribution ρ in ℝ^D that is nearly supported on a d-dimensional set ℳ, for example a d-dimensional manifold. Geometric Multi-Resolution Analysis (GMRA) provides a robust and computationally efficient procedure for constructing low-dimensional geometric approximations of ℳ at varying resolutions. We introduce GMRA approximations that adapt to the unknown regularity of ℳ by thresholding the geometric wavelet coefficients. We show that these data-driven, empirical geometric approximations perform well when the threshold is chosen as a suitable universal function of the number of samples n, on a large class of measures ρ that are allowed to exhibit different regularity at different scales and locations, thereby efficiently encoding data from more complex measures than those supported on manifolds. These GMRA approximations are associated with a dictionary, together with a fast transform mapping data to d-dimensional coefficients and an inverse of that map, all of which are data-driven. The algorithms for both the dictionary construction and the transforms have complexity C D n log n, with the constant C exponential in d. Our work therefore establishes Adaptive GMRA as a fast dictionary-learning algorithm, with approximation guarantees, for intrinsically low-dimensional data. We include several numerical experiments on both synthetic and real data, confirming our theoretical results and demonstrating the effectiveness of Adaptive GMRA.
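To make the pipeline concrete, below is a minimal sketch of the adaptive construction under simplifying assumptions: the multiscale partition is grown by recursive 2-means (a stand-in for the cover-tree construction), each cell is approximated by a rank-d local PCA plane, and a cell is refined only when the empirical gain beats a universal threshold proportional to sqrt(log n / n). The helper names and the constant kappa are illustrative choices, not the authors' implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def local_pca(X, d):
    """Rank-d affine approximation of the rows of X: center plus top-d PCA plane."""
    c = X.mean(axis=0)
    Y = X - c
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    V = Vt[:d].T                                   # (D, d) local tangent basis
    proj = c + (Y @ V) @ V.T                       # projection onto the plane
    return c, V, np.sum((X - proj) ** 2)           # squared approximation error

def build_tree(X, d, min_leaf=20, seed=0):
    """Recursive 2-means tree; each node stores its rank-d local PCA error."""
    def rec(idx, scale):
        c, V, err2 = local_pca(X[idx], d)
        node = dict(idx=idx, scale=scale, center=c, basis=V, err2=err2, children=[])
        if len(idx) >= 2 * min_leaf:
            lab = KMeans(n_clusters=2, n_init=5, random_state=seed).fit_predict(X[idx])
            parts = [idx[lab == l] for l in (0, 1)]
            if all(len(p) >= min_leaf for p in parts):
                node["children"] = [rec(p, scale + 1) for p in parts]
        return node
    return rec(np.arange(len(X)), 0)

def adaptive_prune(node, n, kappa=1.0):
    """Keep a refinement only where it beats tau_n^2 = kappa^2 * log(n)/n,
    mimicking hard thresholding of the geometric wavelet coefficients."""
    tau2 = kappa ** 2 * np.log(n) / n
    if node["children"]:
        gain = (node["err2"] - sum(c["err2"] for c in node["children"])) / n
        if gain > tau2:
            return [leaf for c in node["children"] for leaf in adaptive_prune(c, n, kappa)]
    return [node]   # refinement gain too small: stop at this scale here

# usage: noisy 1-d circle embedded in R^10, approximated with d = 1 planes
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 2000)
X = np.zeros((2000, 10)); X[:, 0], X[:, 1] = np.cos(t), np.sin(t)
X += 0.01 * rng.standard_normal(X.shape)
leaves = adaptive_prune(build_tree(X, d=1), n=len(X))
print(len(leaves), "cells kept at data-adapted scales")
```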
Award ID(s): 1818751
PAR ID: 10148670
Author(s) / Creator(s):
Date Published:
Journal Name: Journal of Machine Learning Research
Volume: 20
Issue: 98
ISSN: 1532-4435
Page Range / eLocation ID: 1-63
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1.
    We consider the regression problem of estimating functions on $\mathbb{R}^D$ but supported on a $d$-dimensional manifold $\mathcal{M} \subset \mathbb{R}^D$ with $d \ll D$. Drawing ideas from multi-resolution analysis and nonlinear approximation, we construct low-dimensional coordinates on $\mathcal{M}$ at multiple scales, and perform multiscale regression by local polynomial fitting. We propose a data-driven wavelet thresholding scheme that automatically adapts to the unknown regularity of the function, allowing for efficient estimation of functions exhibiting nonuniform regularity at different locations and scales. We analyze the generalization error of our method by proving finite sample bounds in high probability on rich classes of priors. Our estimator attains optimal learning rates (up to logarithmic factors) as if the function were defined on a known Euclidean domain of dimension $d$, instead of an unknown manifold embedded in $\mathbb{R}^D$. The implemented algorithm has quasilinear complexity in the sample size, with constants linear in $D$ and exponential in $d$. Our work therefore establishes a new framework for regression on low-dimensional sets embedded in high dimensions, with fast implementation and strong theoretical guarantees.
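As a toy illustration of the data-driven thresholding above (in one dimension, with Haar cell averages standing in for the paper's local polynomial fits on a manifold), the sketch below keeps a detail coefficient only when it exceeds a universal threshold; the function `haar_regress` and the constant `kappa` are illustrative, not the paper's estimator.

```python
import numpy as np

def haar_regress(x, y, max_level=8, kappa=0.5):
    """Estimate f from samples (x_i, y_i) on [0, 1] by Haar cell averages,
    keeping a detail coefficient only if it exceeds kappa*sqrt(log(n)/n)."""
    n = len(x)
    tau = kappa * np.sqrt(np.log(n) / n)
    # cell means at the finest dyadic level (empty cells default to 0)
    bins = np.minimum((x * 2 ** max_level).astype(int), 2 ** max_level - 1)
    sums = np.bincount(bins, weights=y, minlength=2 ** max_level)
    cnts = np.bincount(bins, minlength=2 ** max_level)
    approx = sums / np.maximum(cnts, 1)
    # bottom-up Haar transform with hard thresholding of the details
    details = []
    for _ in range(max_level):
        a = (approx[0::2] + approx[1::2]) / 2      # coarser cell averages
        d = (approx[0::2] - approx[1::2]) / 2      # detail coefficients
        d[np.abs(d) <= tau] = 0.0                  # drop sub-threshold details
        details.append(d)
        approx = a
    # reconstruct the adaptive, piecewise-constant estimator
    for d in reversed(details):
        up = np.empty(2 * len(approx))
        up[0::2], up[1::2] = approx + d, approx - d
        approx = up
    return approx                                  # values on 2**max_level cells

# usage: nonuniform regularity -- smooth on the left, a jump on the right
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 4000)
y = np.sin(4 * x) + (x > 0.75) + 0.1 * rng.standard_normal(4000)
f_hat = haar_regress(x, y)
```

The estimator keeps fine-scale details near the jump and discards them where the function is smooth, which is the adaptivity the abstract describes.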
  2. We propose K-Deep Simplex (KDS), which, given a set of data points, learns a dictionary comprising synthetic landmarks, along with representation coefficients supported on a simplex. KDS employs a local weighted ℓ1 penalty that encourages each data point to represent itself as a convex combination of nearby landmarks. We solve the proposed optimization program using alternating minimization and design an efficient, interpretable autoencoder using algorithm unrolling. We theoretically analyze the proposed program by relating the weighted ℓ1 penalty in KDS to a weighted ℓ0 program. Assuming that the data are generated from a Delaunay triangulation, we prove the equivalence of the weighted ℓ1 and weighted ℓ0 programs. We further show the stability of the representation coefficients under mild geometric assumptions. If the representation coefficients are fixed, we prove that the sub-problem of minimizing over the dictionary yields a unique solution. Further, we show that low-dimensional representations can be efficiently obtained from the covariance of the coefficient matrix. Experiments show that the algorithm is highly efficient and performs competitively on synthetic and real data sets.
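A minimal sketch of the program just described, solved here by plain alternating gradient steps with a simplex projection rather than the unrolled autoencoder; the initialization, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

def proj_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = k[u - css / k > 0][-1]
    return np.maximum(v - css[rho - 1] / rho, 0.0)

def kds(X, k, lam=0.1, iters=500, lr=1e-3, seed=0):
    """X: (D, n) data. Learn landmarks A (D, k) and codes W (k, n) whose
    columns lie on the simplex, penalizing the use of far-away landmarks."""
    rng = np.random.default_rng(seed)
    D, n = X.shape
    A = X[:, rng.choice(n, k, replace=False)].copy()  # init landmarks at data points
    W = np.full((k, n), 1.0 / k)
    for _ in range(iters):
        d2 = ((A[:, :, None] - X[:, None, :]) ** 2).sum(0)  # (k, n): ||a_i - x_j||^2
        # W-step: gradient of ||X - AW||^2 + lam * sum_ij W_ij d2_ij, then project
        W -= lr * (2 * A.T @ (A @ W - X) + lam * d2)
        W = np.apply_along_axis(proj_simplex, 0, W)
        # A-step: gradient of the same objective with respect to the landmarks
        A -= lr * (2 * (A @ W - X) @ W.T
                   + 2 * lam * (A * W.sum(1) - X @ W.T))
    return A, W
```

Because each column of W is nonnegative and sums to one, the weighted ℓ1 penalty reduces to the sum of W_ij times the squared landmark-to-point distances, which is what the W-step gradient uses; the low-dimensional embedding from the covariance of W mentioned in the abstract is omitted here.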
  3. We study two-stage stochastic optimization problems with random recourse, where the coefficients of the adaptive decisions involve uncertain parameters. To deal with the infinite-dimensional recourse decisions, we propose a scalable approximation scheme via piecewise linear and piecewise quadratic decision rules. We develop a data-driven distributionally robust framework with two layers of robustness to address distributional uncertainty. We also establish out-of-sample performance guarantees for the proposed scheme. By applying known reformulation ideas, the resulting optimization problem can be recast as an exact copositive program that admits semidefinite programming approximations. We design an iterative decomposition algorithm, which converges under some regularity conditions, to reduce the runtime needed to solve this program. Through numerical examples on several operations management applications, we demonstrate that our method produces significantly better solutions than the traditional sample-average approximation scheme, especially when the data are limited. For problem instances in which only the recourse cost coefficients are random, our method exhibits slightly inferior out-of-sample performance but shorter runtimes than a competing approach.
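A toy, scenario-based illustration of the piecewise-linear decision-rule idea (not the paper's distributionally robust copositive reformulation): lifting the uncertainty with a hinge feature lets a nominally linear rule act piecewise linearly in the uncertain parameter. The newsvendor-style problem, the kink location, the costs, and the `cvxpy` dependency are all assumptions for illustration.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m = 200                                   # sampled demand scenarios
xi = rng.uniform(0, 10, m)                # uncertain demand

# lift xi with a hinge feature so a "linear" rule becomes piecewise linear
Z = np.column_stack([np.ones(m), xi, np.maximum(xi - 5.0, 0.0)])

x = cp.Variable(nonneg=True)              # first-stage order quantity, unit cost 1
beta = cp.Variable(3)                     # recourse rule y(xi) = Z @ beta
y = Z @ beta                              # backorders, unit cost 3
constraints = [y >= xi - x, y >= 0]       # cover unmet demand in every scenario
problem = cp.Problem(cp.Minimize(x + 3.0 * cp.sum(y) / m), constraints)
problem.solve()
print(f"order quantity: {x.value:.2f}")
```

Dropping the third column of Z recovers a plain linear decision rule for comparison; the hinge feature lets the recourse rule stay near zero until demand exceeds the order quantity, which is exactly the flexibility piecewise rules buy.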
  4. Deep generative models have experienced great empirical success in distribution learning. Many existing experiments have demonstrated that deep generative networks can efficiently generate high-dimensional complex data from a low-dimensional, easy-to-sample distribution. However, this phenomenon cannot be justified by existing theories. The widely held manifold hypothesis speculates that real-world data sets, such as natural images and signals, exhibit low-dimensional geometric structures. In this paper, we take such low-dimensional data structures into consideration by assuming that data distributions are supported on a low-dimensional manifold. We prove approximation and estimation theories of deep generative networks for estimating distributions on a low-dimensional manifold under the Wasserstein-1 loss. We show that the Wasserstein-1 loss converges to zero at a fast rate depending on the intrinsic dimension instead of the ambient data dimension. Our theory leverages the low-dimensional geometric structures in data sets and justifies the practical power of deep generative models. We require no smoothness assumptions on the data distribution, which is desirable in practice.
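The following toy experiment matches the setting the abstract describes: data on a 1-D manifold (a circle) embedded in R^20, and a generator pushing a 1-D easy-to-sample latent onto it. It minimizes a sliced Wasserstein-1 surrogate rather than the exact W1 of the paper's analysis; the architecture and training schedule are illustrative, and PyTorch is an assumed dependency.

```python
import torch

D, n = 20, 512
torch.manual_seed(0)
U = torch.linalg.qr(torch.randn(D, 2)).Q                  # random 2-plane in R^D
theta = 2 * torch.pi * torch.rand(n)
data = torch.stack([theta.cos(), theta.sin()], 1) @ U.T   # circle embedded in R^D

gen = torch.nn.Sequential(                                # 1-D latent -> R^D
    torch.nn.Linear(1, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, D))
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)

def sliced_w1(a, b, n_proj=64):
    """Average 1-D W1 distance over random projections (a W1 surrogate)."""
    v = torch.nn.functional.normalize(torch.randn(n_proj, a.shape[1]), dim=1)
    pa, pb = a @ v.T, b @ v.T                             # (n, n_proj) projections
    return (pa.sort(0).values - pb.sort(0).values).abs().mean()

for step in range(2000):
    z = torch.rand(n, 1) * 2 * torch.pi                   # easy-to-sample latent
    loss = sliced_w1(gen(z), data)
    opt.zero_grad(); loss.backward(); opt.step()
print(f"final sliced-W1: {loss.item():.4f}")
```

The loss decays quickly even though the ambient dimension is 20, consistent with rates that depend on the intrinsic dimension 1 rather than D.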
  5. Two-sample tests aim to determine whether two collections of observations follow the same distribution. We propose two-sample tests based on integral probability metrics (IPMs) for high-dimensional samples supported on a low-dimensional manifold. We characterize the properties of the proposed tests with respect to the number of samples $n$ and the structure of the manifold with intrinsic dimension $d$. When an atlas is given, we propose a two-step test to identify the difference between general distributions, which achieves a type-II risk of order $n^{-1/\max\{d,2\}}$. When an atlas is not given, we propose a Hölder IPM test that applies to data distributions with $(s,\beta)$-Hölder densities, which achieves a type-II risk of order $n^{-(s+\beta)/d}$. To mitigate the heavy computational burden of evaluating the Hölder IPM, we approximate the Hölder function class using neural networks. Based on the approximation theory of neural networks, we show that the neural network IPM test has a type-II risk of order $n^{-(s+\beta)/d}$, matching that of the Hölder IPM test. Our proposed tests are adaptive to low-dimensional geometric structure because their performance crucially depends on the intrinsic dimension instead of the data dimension.
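A hedged sketch of a neural-network IPM two-sample test: maximize the mean discrepancy E_P f − E_Q f over a small network class, then calibrate the statistic by permutation. The architecture, the weight decay (a crude stand-in for a norm-bounded function class), and the permutation calibration are illustrative choices, and PyTorch is an assumed dependency.

```python
import numpy as np
import torch

def nn_ipm(X, Y, steps=300):
    """Fit f to maximize mean(f(X)) - mean(f(Y)); return the achieved gap."""
    f = torch.nn.Sequential(torch.nn.Linear(X.shape[1], 32), torch.nn.Tanh(),
                            torch.nn.Linear(32, 1))
    opt = torch.optim.Adam(f.parameters(), lr=1e-2, weight_decay=1e-3)
    Xt = torch.as_tensor(X, dtype=torch.float32)
    Yt = torch.as_tensor(Y, dtype=torch.float32)
    for _ in range(steps):
        gap = f(Xt).mean() - f(Yt).mean()
        opt.zero_grad(); (-gap).backward(); opt.step()
    return gap.item()

def two_sample_test(X, Y, n_perm=50, alpha=0.05, seed=0):
    """Reject 'same distribution' if the observed IPM statistic exceeds the
    (1 - alpha) quantile of statistics computed on permuted pooled samples."""
    rng = np.random.default_rng(seed)
    obs = nn_ipm(X, Y)
    Z, n = np.vstack([X, Y]), len(X)
    null = []
    for _ in range(n_perm):
        p = rng.permutation(len(Z))
        null.append(nn_ipm(Z[p[:n]], Z[p[n:]]))
    return obs > np.quantile(null, 1 - alpha)
```

In practice one would split the samples, fitting f on one half and evaluating the gap on the other, since a statistic optimized and evaluated on the same data is optimistically biased.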