Triangular systems with nonadditively separable unobserved heterogeneity provide a theoretically appealing framework for the modeling of complex structural relationships. However, they are not commonly used in practice due to the need for exogenous variables with large support for identification, the curse of dimensionality in estimation, and the lack of inferential tools. This paper introduces two classes of semiparametric nonseparable triangular models that address these limitations. They are based on distribution and quantile regression modeling of the reduced form conditional distributions of the endogenous variables. We show that average, distribution, and quantile structural functions are identified in these systems through a control function approach that does not require a large support condition. We propose a computationally attractive three‐stage procedure to estimate the structural functions where the first two stages consist of quantile or distribution regressions. We provide asymptotic theory and uniform inference methods for each stage. In particular, we derive functional central limit theorems and bootstrap functional central limit theorems for the distribution regression estimators of the structural functions. These results establish the validity of the bootstrap for three‐stage estimators of structural functions, and lead to simple inference algorithms. We illustrate the implementation and applicability of all our methods with numerical simulations and an empirical application to demand analysis.
- Award ID(s):
- NSF-PAR ID:
- Date Published:
- Journal Name:
- Quantifying the Empirical Wasserstein Distance to a Set of Measures: Beating the Curse of Dimensionality
- Page Range / eLocation ID:
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
Statistical inference can be performed by minimizing, over the parameter space, the Wasserstein distance between model distributions and the empirical distribution of the data. We study asymptotic properties of such minimum Wasserstein distance estimators, complementing results derived by Bassetti, Bodini and Regazzini in 2006. In particular, our results cover the misspecified setting, in which the data-generating process is not assumed to be part of the family of distributions described by the model. Our results are motivated by recent applications of minimum Wasserstein estimators to complex generative models. We discuss some difficulties arising in the numerical approximation of these estimators. Two of our numerical examples ($g$-and-$\kappa$ and sum of log-normals) are taken from the literature on approximate Bayesian computation and have likelihood functions that are not analytically tractable. Two other examples involve misspecified models.
Supervised dimensionality reduction for sequence data projects the observations in sequences onto a low-dimensional subspace to better separate different sequence classes. It is typically more challenging than conventional dimensionality reduction for static data, because measuring the separability of sequences involves non-linear procedures to manipulate the temporal structures. This paper presents a linear method, namely Order-preserving Wasserstein Discriminant Analysis (OWDA), which learns the projection by maximizing the inter-class distance and minimizing the intra-class scatter. For each class, OWDA extracts the order-preserving Wasserstein barycenter and constructs the intra-class scatter as the dispersion of the training sequences around the barycenter. The inter-class distance is measured as the order-preserving Wasserstein distance between the corresponding barycenters. OWDA is able to concentrate on the distinctive differences among classes by lifting the geometric relations with temporal constraints. Experiments show that OWDA achieves competitive results on three 3D action recognition datasets.more » « less
null (Ed.)Supervised dimensionality reduction for sequence data learns a transformation that maps the observations in sequences onto a low-dimensional subspace by maximizing the separability of sequences in different classes. It is typically more challenging than conventional dimensionality reduction for static data, because measuring the separability of sequences involves non-linear procedures to manipulate the temporal structures. In this paper, we propose a linear method, called Order-preserving Wasserstein Discriminant Analysis (OWDA), and its deep extension, namely DeepOWDA, to learn linear and non-linear discriminative subspace for sequence data, respectively. We construct novel separability measures between sequence classes based on the order-preserving Wasserstein (OPW) distance to capture the essential differences among their temporal structures. Specifically, for each class, we extract the OPW barycenter and construct the intra-class scatter as the dispersion of the training sequences around the barycenter. The inter-class distance is measured as the OPW distance between the corresponding barycenters. We learn the linear and non-linear transformations by maximizing the inter-class distance and minimizing the intra-class scatter. In this way, the proposed OWDA and DeepOWDA are able to concentrate on the distinctive differences among classes by lifting the geometric relations with temporal constraints. Experiments on four 3D action recognition datasets show the effectiveness of OWDA and DeepOWDA.more » « less
Optimal transport (OT) methods seek a transformation map (or plan) between two probability measures, such that the transformation has the minimum transportation cost. Such a minimum transport cost, with a certain power transform, is called the Wasserstein distance. Recently, OT methods have drawn great attention in statistics, machine learning, and computer science, especially in deep generative neural networks. Despite its broad applications, the estimation of high‐dimensional Wasserstein distances is a well‐known challenging problem owing to the curse‐of‐dimensionality. There are some cutting‐edge projection‐based techniques that tackle high‐dimensional OT problems. Three major approaches of such techniques are introduced, respectively, the slicing approach, the iterative projection approach, and the projection robust OT approach. Open challenges are discussed at the end of the review.
This article is categorized under:
Statistical and Graphical Methods of Data Analysis > Dimension Reduction
Statistical Learning and Exploratory Methods of the Data Sciences > Manifold Learning