 Award ID(s):
 1818751
 NSFPAR ID:
 10296154
 Date Published:
 Journal Name:
 International Conference on Machine Learning
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation

Deep generative models have achieved great empirical success in distribution learning. Many existing experiments demonstrate that deep generative networks can efficiently generate high-dimensional complex data from a low-dimensional, easy-to-sample distribution. However, this phenomenon cannot be justified by existing theories. The widely held manifold hypothesis speculates that real-world data sets, such as natural images and signals, exhibit low-dimensional geometric structures. In this paper, we take such low-dimensional data structures into consideration by assuming that data distributions are supported on a low-dimensional manifold. We prove approximation and estimation theories of deep generative networks for estimating distributions on a low-dimensional manifold under the Wasserstein-1 loss. We show that the Wasserstein-1 loss converges to zero at a fast rate depending on the intrinsic dimension instead of the ambient data dimension. Our theory leverages the low-dimensional geometric structures in data sets and justifies the practical power of deep generative models. We require no smoothness assumptions on the data distribution, which is desirable in practice.
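The setting above can be illustrated with a minimal sketch (the generator, coordinates, and sample sizes below are illustrative assumptions, not the paper's construction): a toy "generator" pushes a one-dimensional, easy-to-sample latent onto a curve embedded in a higher-dimensional ambient space, and an empirical Wasserstein-1 distance between two independent pushforward samples shrinks with the sample size.

```python
import numpy as np

def generator(z):
    # Hypothetical toy generator: pushes a 1-D latent z in [0, 2*pi)
    # onto a closed curve (intrinsic dimension 1) embedded in R^10.
    ambient = np.zeros((len(z), 10))
    ambient[:, 0] = np.cos(z)
    ambient[:, 1] = np.sin(z)
    return ambient  # samples supported on a 1-D manifold in R^10

def wasserstein1_1d(x, y):
    # Empirical Wasserstein-1 distance between two equal-size 1-D
    # samples: mean absolute difference of the sorted values.
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

rng = np.random.default_rng(0)
z1 = rng.uniform(0.0, 2 * np.pi, 5000)
z2 = rng.uniform(0.0, 2 * np.pi, 5000)
x1, x2 = generator(z1), generator(z2)

# Compare one ambient coordinate of two independent pushforwards;
# the empirical W1 distance is small and shrinks as n grows.
d = wasserstein1_1d(x1[:, 0], x2[:, 0])
print(d < 0.1)
```

The point of the sketch is only that the data live on a low-dimensional manifold inside a high-dimensional ambient space, which is the regime where the paper's rates depend on the intrinsic rather than the ambient dimension.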

Deep neural networks have revolutionized many real-world applications, due to their flexibility in data fitting and accurate predictions for unseen data. A line of research reveals that neural networks can approximate certain classes of functions to arbitrary accuracy, while the size of the network scales exponentially with respect to the data dimension. Empirical results, however, suggest that networks of moderate size already yield appealing performance. To explain such a gap, a common belief is that many data sets exhibit low-dimensional structures, and can be modeled as samples near a low-dimensional manifold. In this paper, we prove that neural networks can efficiently approximate functions supported on low-dimensional manifolds. The network size scales exponentially in the approximation error, with an exponent depending on the intrinsic dimension of the data and the smoothness of the function. Our result shows that exploiting low-dimensional data structures can greatly enhance the efficiency of function approximation by neural networks. We also implement a subnetwork that assigns input data to their corresponding local neighborhoods, which may be of independent interest.
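The neighborhood-assignment idea can be sketched as follows. This is an illustrative stand-in, not the paper's subnetwork: each input is routed to the chart whose center is nearest in the ambient Euclidean metric, mimicking the role of a subnetwork that sends points to local neighborhoods before a local approximation is applied.

```python
import numpy as np

def assign_chart(x, centers):
    # Hypothetical chart-assignment routine: route each input point to
    # the chart whose center is nearest in the ambient Euclidean metric.
    # Stands in for a subnetwork that assigns inputs to local
    # neighborhoods of the manifold.
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.argmin(d2, axis=1)

# Chart centers placed on the unit circle (a 1-D manifold in R^2).
K = 4
theta = 2 * np.pi * np.arange(K) / K
centers = np.stack([np.cos(theta), np.sin(theta)], axis=1)

# A point near angle 0 lands in chart 0; a point near angle pi/2 in chart 1.
x = np.array([[0.99, 0.05], [-0.02, 1.01]])
labels = assign_chart(x, centers)
print(labels)  # → [0 1]
```

Once points are routed to charts, a separate local approximator can be used per chart, which is why the overall complexity can depend on the intrinsic rather than the ambient dimension.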

Abstract: Two-sample tests aim to determine whether two collections of observations follow the same distribution. We propose two-sample tests based on integral probability metrics (IPMs) for high-dimensional samples supported on a low-dimensional manifold. We characterize the properties of the proposed tests with respect to the number of samples $n$ and the structure of the manifold with intrinsic dimension $d$. When an atlas is given, we propose a two-step test to identify the difference between general distributions, which achieves the type-II risk in the order of $n^{-1/\max\{d,2\}}$. When an atlas is not given, we propose a Hölder IPM test that applies to data distributions with $(s,\beta)$-Hölder densities, which achieves the type-II risk in the order of $n^{-(s+\beta)/d}$. To mitigate the heavy computational burden of evaluating the Hölder IPM, we approximate the Hölder function class using neural networks. Based on the approximation theory of neural networks, we show that the neural network IPM test has type-II risk in the order of $n^{-(s+\beta)/d}$, the same order as that of the Hölder IPM test. Our proposed tests are adaptive to low-dimensional geometric structure because their performance crucially depends on the intrinsic dimension instead of the data dimension.
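An IPM-type test statistic can be sketched as follows. As an illustrative assumption, the Hölder / neural-network function class is replaced here by a fixed bank of random one-layer ReLU features (not the paper's trained discriminator); the statistic is the largest discrepancy in feature means between the two samples.

```python
import numpy as np

def ipm_statistic(x, y, n_feats=200, seed=0):
    # Sketch of an IPM-type two-sample statistic over a finite function
    # class: random ReLU features w, b stand in for the discriminator
    # class (an assumption for illustration). Returns
    #   max_f | mean f(X) - mean f(Y) |  over the feature bank.
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(x.shape[1], n_feats))
    b = rng.normal(size=n_feats)
    fx = np.maximum(x @ w + b, 0.0).mean(axis=0)
    fy = np.maximum(y @ w + b, 0.0).mean(axis=0)
    return np.abs(fx - fy).max()

rng = np.random.default_rng(1)
# Same distribution: statistic stays small.
same = ipm_statistic(rng.normal(size=(2000, 3)),
                     rng.normal(size=(2000, 3)))
# Mean-shifted alternative: statistic grows, so the test rejects.
diff = ipm_statistic(rng.normal(size=(2000, 3)),
                     rng.normal(1.0, 1.0, size=(2000, 3)))
print(same < diff)
```

In practice the rejection threshold would be calibrated (e.g. by permutation), and the paper's analysis concerns how the power of such statistics scales with $n$ and the intrinsic dimension $d$ rather than this particular feature bank.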