Two-sample tests aim to determine whether two collections of observations follow the same distribution. We propose two-sample tests based on integral probability metrics (IPMs) for high-dimensional samples supported on a low-dimensional manifold. We characterize the properties of the proposed tests with respect to the number of samples $n$ and the structure of the manifold with intrinsic dimension $d$. When an atlas is given, we propose a two-step test to identify the difference between general distributions, which achieves a type-II risk of order $n^{-1/\max \{d,2\}}$. When an atlas is not given, we propose a Hölder IPM test that applies to data distributions with $(s,\beta)$-Hölder densities, which achieves a type-II risk of order $n^{-(s+\beta)/d}$. To mitigate the heavy computational burden of evaluating the Hölder IPM, we approximate the Hölder function class using neural networks. Based on the approximation theory of neural networks, we show that the neural network IPM test achieves a type-II risk of the same order, $n^{-(s+\beta)/d}$, as the Hölder IPM test. Our proposed tests are adaptive to low-dimensional geometric structure because their performance depends crucially on the intrinsic dimension instead of the ambient data dimension.
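To make the IPM testing idea concrete, here is a minimal sketch in one dimension, where the IPM over 1-Lipschitz functions equals the Wasserstein-1 distance and has a closed form. This is an illustration only, not the paper's manifold-adaptive procedure: the statistic and the permutation calibration are standard textbook devices, and all names below are our own.

```python
import numpy as np

def wasserstein1_1d(x, y):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples.
    Over the class of 1-Lipschitz test functions, the IPM equals W1, which
    in 1-D is the mean absolute difference of the sorted samples."""
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

def permutation_test(x, y, n_perm=500, seed=0):
    """Two-sample test: reject when the observed IPM statistic is large
    relative to the permutation null. Returns the permutation p-value."""
    rng = np.random.default_rng(seed)
    stat = wasserstein1_1d(x, y)
    pooled = np.concatenate([x, y])
    n = len(x)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if wasserstein1_1d(perm[:n], perm[n:]) >= stat:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

rng = np.random.default_rng(1)
same = permutation_test(rng.normal(0, 1, 200), rng.normal(0, 1, 200))
diff = permutation_test(rng.normal(0, 1, 200), rng.normal(1, 1, 200))
print(same, diff)
```

With a unit mean shift the p-value collapses toward its permutation floor, while for identically distributed samples it stays large, which is the type-I/type-II trade-off the abstract quantifies.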

Abstract 
Free, publicly accessible full text available May 1, 2024

Deep generative models have enjoyed great empirical success in distribution learning. Many existing experiments demonstrate that deep generative networks can efficiently generate high-dimensional complex data from a low-dimensional, easy-to-sample distribution. However, this phenomenon cannot be justified by existing theories. The widely held manifold hypothesis speculates that real-world data sets, such as natural images and signals, exhibit low-dimensional geometric structures. In this paper, we take such low-dimensional data structures into consideration by assuming that data distributions are supported on a low-dimensional manifold. We prove approximation and estimation theories of deep generative networks for estimating distributions on a low-dimensional manifold under the Wasserstein-1 loss. We show that the Wasserstein-1 loss converges to zero at a fast rate depending on the intrinsic dimension instead of the ambient data dimension. Our theory leverages the low-dimensional geometric structures in data sets and justifies the practical power of deep generative models. We require no smoothness assumptions on the data distribution, which is desirable in practice.
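The two ingredients of this abstract, a pushforward map from a low-dimensional latent and the Wasserstein-1 loss, can be sketched numerically. The "generator" below is a hand-built map onto a circle, standing in for a trained network, and the exact empirical W1 is computed by an optimal matching; both are our illustrative choices, not the paper's construction.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def wasserstein1(x, y):
    """Exact empirical Wasserstein-1 distance between two equal-size point
    clouds, computed via an optimal matching (assignment problem)."""
    cost = cdist(x, y)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean()

rng = np.random.default_rng(0)
n = 300

# "data" supported on a 1-dimensional manifold (the unit circle) in R^2
theta = rng.uniform(0, 2 * np.pi, n)
data = np.c_[np.cos(theta), np.sin(theta)]

# a hand-built "generator": push a 1-D easy-to-sample latent onto the circle
z = rng.uniform(0, 2 * np.pi, n)
gen_on_manifold = np.c_[np.cos(z), np.sin(z)]

# a baseline sampler that ignores the manifold structure entirely
gen_ambient = rng.normal(0.0, 1.0, size=(n, 2))

w_good = wasserstein1(data, gen_on_manifold)
w_bad = wasserstein1(data, gen_ambient)
print(w_good, w_bad)
```

A pushforward that respects the manifold achieves a much smaller W1 loss than an ambient sampler of the same sample size, which is the low-dimensional effect the estimation theory formalizes.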

We consider the regression problem of estimating functions on $\mathbb{R}^D$ but supported on a $d$-dimensional manifold $\mathcal{M} \subset \mathbb{R}^D$ with $d \ll D$. Drawing ideas from multiresolution analysis and nonlinear approximation, we construct low-dimensional coordinates on $\mathcal{M}$ at multiple scales, and perform multiscale regression by local polynomial fitting. We propose a data-driven wavelet thresholding scheme that automatically adapts to the unknown regularity of the function, allowing for efficient estimation of functions exhibiting non-uniform regularity at different locations and scales. We analyze the generalization error of our method by proving finite sample bounds in high probability on rich classes of priors. Our estimator attains optimal learning rates (up to logarithmic factors) as if the function were defined on a known Euclidean domain of dimension $d$, instead of an unknown manifold embedded in $\mathbb{R}^D$. The implemented algorithm has quasilinear complexity in the sample size, with constants linear in $D$ and exponential in $d$. Our work therefore establishes a new framework for regression on low-dimensional sets embedded in high dimensions, with fast implementation and strong theoretical guarantees.
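The wavelet-thresholding ingredient can be illustrated in the simplest possible setting: Haar wavelets on a regular 1-D Euclidean grid with the universal soft threshold. This is only a sketch of the thresholding principle; the paper's construction builds multiscale coordinates on an unknown manifold, which is far beyond this toy.

```python
import numpy as np

def haar_forward(v):
    """Full orthonormal Haar transform of a length-2^k vector."""
    details, a = [], v.copy()
    while len(a) > 1:
        s = (a[0::2] + a[1::2]) / np.sqrt(2)
        d = (a[0::2] - a[1::2]) / np.sqrt(2)
        details.append(d)
        a = s
    details.append(a)          # final entry holds the scaling coefficient
    return details

def haar_inverse(coeffs):
    a = coeffs[-1]
    for d in reversed(coeffs[:-1]):
        s = a
        a = np.empty(2 * len(d))
        a[0::2] = (s + d) / np.sqrt(2)
        a[1::2] = (s - d) / np.sqrt(2)
    return a

def wavelet_threshold_regress(y, sigma):
    """Soft-threshold the detail coefficients with the universal threshold:
    large coefficients (real structure) survive, small ones (mostly noise)
    are shrunk to zero, adapting automatically to local regularity."""
    lam = sigma * np.sqrt(2 * np.log(len(y)))
    coeffs = haar_forward(y)
    thr = [np.sign(d) * np.maximum(np.abs(d) - lam, 0) for d in coeffs[:-1]]
    return haar_inverse(thr + [coeffs[-1]])

rng = np.random.default_rng(0)
n = 256
t = np.linspace(0, 1, n)
f = np.where(t < 0.4, 0.0, np.where(t < 0.7, 2.0, -1.0))  # non-uniform regularity
y = f + 0.3 * rng.normal(size=n)
fhat = wavelet_threshold_regress(y, 0.3)
mse_noisy = np.mean((y - f) ** 2)
mse_hat = np.mean((fhat - f) ** 2)
print(mse_noisy, mse_hat)
```

The estimator spends coefficients only where the function is rough (the jumps) and denoises aggressively on the flat pieces, which is the adaptivity to non-uniform regularity described above.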

Overparameterized neural networks enjoy great representation power on complex data and, more importantly, yield sufficiently smooth output, which is crucial to their generalization and robustness. Most existing function approximation theories suggest that with sufficiently many parameters, neural networks can well approximate certain classes of functions in terms of the function value. The neural networks themselves, however, can be highly non-smooth. To bridge this gap, we take convolutional residual networks (ConvResNets) as an example, and prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness. Moreover, we extend our theory to approximating functions supported on a low-dimensional manifold. Our theory partially justifies the benefits of using deep and wide networks in practice. Numerical experiments on adversarially robust image classification are provided to support our theory.
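The gap between value accuracy and first-order smoothness can be seen numerically on a toy one-hidden-layer network. The weights below are arbitrary illustrative choices, and the smooth activation is an ordinary softplus, not the paper's ConvResNet construction: the point is only that a ReLU network's derivative jumps while a smooth-activation network's does not.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softplus(x):
    return np.log1p(np.exp(x))  # a smooth surrogate for ReLU

# a tiny fixed one-hidden-layer network with 3 hidden units (toy weights)
W1 = np.array([1.0, -2.0, 0.5])
b1 = np.array([0.0, 1.0, -0.25])
w2 = np.array([1.0, 0.5, -1.0])

xs = np.linspace(-2, 2, 4001)
h = xs[1] - xs[0]

def net(act):
    return act(np.outer(xs, W1) + b1) @ w2

def max_grad_jump(y):
    """Largest jump of the finite-difference derivative between neighboring
    grid intervals: a crude numerical measure of first-order non-smoothness."""
    g = np.diff(y) / h
    return np.max(np.abs(np.diff(g)))

jump_relu = max_grad_jump(net(relu))
jump_soft = max_grad_jump(net(softplus))
print(jump_relu, jump_soft)
```

The two networks are close in function value, but the ReLU derivative jumps by an O(1) amount at each kink, whereas the softplus derivative varies only by O(h) per grid step; the abstract's result is that ConvResNets can be built to land on the smooth side of this dichotomy.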

Deep neural networks have revolutionized many real-world applications, due to their flexibility in data fitting and accurate predictions for unseen data. A line of research reveals that neural networks can approximate certain classes of functions with arbitrary accuracy, while the size of the network scales exponentially with respect to the data dimension. Empirical results, however, suggest that networks of moderate size already yield appealing performance. To explain such a gap, a common belief is that many data sets exhibit low-dimensional structures, and can be modeled as samples near a low-dimensional manifold. In this paper, we prove that neural networks can efficiently approximate functions supported on low-dimensional manifolds. The network size scales exponentially in the approximation error, with an exponent depending on the intrinsic dimension of the data and the smoothness of the function. Our result shows that exploiting low-dimensional data structures can greatly enhance the efficiency of function approximation by neural networks. We also implement a subnetwork that assigns input data to their corresponding local neighborhoods, which may be of independent interest.
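The chart-based strategy behind this abstract, routing each input to a local neighborhood and approximating the function locally, can be sketched without any training. Below, the assignment step is a nearest-center rule playing the role of the assignment subnetwork, and the local approximants are affine least-squares fits; both are our stand-ins, not the paper's network construction.

```python
import numpy as np

rng = np.random.default_rng(0)
D, n = 10, 2000

# a 1-dimensional manifold (a circle) embedded in R^D via a fixed isometry
Q, _ = np.linalg.qr(rng.normal(size=(D, 2)))
theta = rng.uniform(0, 2 * np.pi, n)
X = np.c_[np.cos(theta), np.sin(theta)] @ Q.T   # samples in R^D
f = np.sin(3 * theta)                           # target function on the manifold

# "assignment subnetwork" stand-in: route each point to its nearest chart center
K = 16
centers_theta = np.linspace(0, 2 * np.pi, K, endpoint=False)
centers = np.c_[np.cos(centers_theta), np.sin(centers_theta)] @ Q.T
charts = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)

# local approximation: one affine fit in ambient coordinates per chart
pred = np.empty(n)
for k in range(K):
    m = charts == k
    A = np.c_[X[m], np.ones(m.sum())]           # affine design matrix
    coef, *_ = np.linalg.lstsq(A, f[m], rcond=None)
    pred[m] = A @ coef

rmse = np.sqrt(np.mean((pred - f) ** 2))
print(rmse)
```

Even though the ambient dimension is $D = 10$, the error is controlled by the chart size along the one-dimensional manifold, which is the intrinsic-dimension dependence the theorem makes precise.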