Abstract While researchers commonly use the bootstrap to quantify the uncertainty of an estimator, it has been noticed that the standard bootstrap, in general, does not work for Chatterjee’s rank correlation. In this paper, we provide proof of this issue under an additional independence assumption, and complement our theory with simulation evidence for general settings. Chatterjee’s rank correlation thus falls into a category of statistics that are asymptotically normal, but bootstrap inconsistent. Valid inferential methods in this case are Chatterjee’s original proposal for testing independence and the analytic asymptotic variance estimator of Lin & Han (2022) for more general purposes. [Received on 5 April 2023. Editorial decision on 10 January 2024]
Bootstrapping Persistent Betti Numbers and Other Stabilizing Statistics
We investigate multivariate bootstrap procedures for general stabilizing statistics, with specific application to topological data analysis. The work relates to other general results in the area of stabilizing statistics, including central limit theorems for geometric and topological functionals of Poisson and binomial processes in the critical regime, where limit theorems prove difficult to use in practice, motivating the use of a bootstrap approach. A smoothed bootstrap procedure is shown to give consistent estimation in these settings. Specific statistics considered include the persistent Betti numbers of Čech and Vietoris–Rips complexes over point sets in $\mathbb{R}^d$, along with Euler characteristics, and the total edge length of the k-nearest neighbor graph. Special emphasis is given to weakening the conditions needed to establish bootstrap consistency. In particular, the assumption of a continuous underlying density is not required. Numerical studies illustrate the performance of the proposed method.
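As a rough illustration of the idea in the abstract, the sketch below applies a smoothed bootstrap to one of the listed statistics, the total edge length of the k-nearest neighbor graph. It is not the paper's procedure: the Gaussian kernel, the fixed `bandwidth`, the directed-edge version of the statistic, and the function names `knn_total_edge_length` and `smoothed_bootstrap` are all assumptions made for this example.

```python
import numpy as np


def knn_total_edge_length(points, k=1):
    """Sum of distances from each point to its k nearest neighbors.

    Simplification (assumption): edges are counted per point, so a pair of
    mutual nearest neighbors contributes its distance twice.
    """
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    nearest = np.sort(dists, axis=1)[:, :k]
    return float(nearest.sum())


def smoothed_bootstrap(points, statistic, n_boot=200, bandwidth=0.1, seed=0):
    """Return n_boot replicates of `statistic` under a smoothed bootstrap.

    Each replicate resamples the points with replacement and adds
    independent Gaussian noise scaled by `bandwidth`, which amounts to
    sampling from a Gaussian kernel density estimate (assumed kernel).
    """
    rng = np.random.default_rng(seed)
    n, d = points.shape
    replicates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)           # resample indices
        noise = bandwidth * rng.standard_normal((n, d))
        replicates[b] = statistic(points[idx] + noise)
    return replicates


# Usage: spread of the 1-NN total edge length for points in the unit square.
sample = np.random.default_rng(1).uniform(size=(100, 2))
reps = smoothed_bootstrap(sample, knn_total_edge_length, n_boot=50, bandwidth=0.05)
bootstrap_std = reps.std(ddof=1)
```

The added noise removes the duplicate points that a standard bootstrap would create; duplicates would otherwise produce zero-length nearest-neighbor edges. How the bandwidth should shrink with the sample size is left open here.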
- Award ID(s): 2015575
- PAR ID: 10560618
- Publisher / Repository: https://projecteuclid.org/
- Date Published:
- Journal Name: The Annals of Statistics
- ISSN: 2168-8966
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Triangular systems with nonadditively separable unobserved heterogeneity provide a theoretically appealing framework for the modeling of complex structural relationships. However, they are not commonly used in practice due to the need for exogenous variables with large support for identification, the curse of dimensionality in estimation, and the lack of inferential tools. This paper introduces two classes of semiparametric nonseparable triangular models that address these limitations. They are based on distribution and quantile regression modeling of the reduced form conditional distributions of the endogenous variables. We show that average, distribution, and quantile structural functions are identified in these systems through a control function approach that does not require a large support condition. We propose a computationally attractive three‐stage procedure to estimate the structural functions where the first two stages consist of quantile or distribution regressions. We provide asymptotic theory and uniform inference methods for each stage. In particular, we derive functional central limit theorems and bootstrap functional central limit theorems for the distribution regression estimators of the structural functions. These results establish the validity of the bootstrap for three‐stage estimators of structural functions, and lead to simple inference algorithms. We illustrate the implementation and applicability of all our methods with numerical simulations and an empirical application to demand analysis.
- Abstract Gauging is a powerful operation on symmetries in quantum field theory (QFT), as it connects distinct theories and also reveals hidden structures in a given theory. We initiate a systematic investigation of gauging discrete generalized symmetries in two-dimensional QFT. Such symmetries are described by topological defect lines (TDLs) which obey fusion rules that are non-invertible in general. Despite this seemingly exotic feature, all well-known properties in gauging invertible symmetries carry over to this general setting, which greatly enhances both the scope and the power of gauging. This is established by formulating generalized gauging in terms of topological interfaces between QFTs, which explains the physical picture for the mathematical concept of algebra objects and associated module categories over fusion categories that encapsulate the algebraic properties of generalized symmetries and their gaugings. This perspective also provides simple physical derivations of well-known mathematical theorems in category theory from basic axiomatic properties of QFT in the presence of such interfaces. We discuss a bootstrap-type analysis to classify such topological interfaces and thus the possible generalized gaugings and demonstrate the procedure in concrete examples of fusion categories. Moreover, we present a number of examples to illustrate generalized gauging and its properties in concrete conformal field theories (CFTs). In particular, we identify the generalized orbifold groupoid that captures the structure of fusion between topological interfaces (equivalently sequential gaugings) as well as a plethora of new self-dualities in CFTs under generalized gaugings.
- This paper establishes central limit theorems (CLTs) and proposes how to perform valid inference in factor models. We consider a setting where many counties/regions/assets are observed for many time periods, and when estimation of a global parameter includes aggregation of a cross-section of heterogeneous microparameters estimated separately for each entity. The CLT applies for quantities involving both cross-sectional and time series aggregation, as well as for quadratic forms in time-aggregated errors. This paper studies the conditions when one can consistently estimate the asymptotic variance, and proposes a bootstrap scheme for cases when one cannot. A small simulation study illustrates performance of the asymptotic and bootstrap procedures. The results are useful for making inferences in two-step estimation procedures related to factor models, as well as in other related contexts. Our treatment avoids structural modeling of cross-sectional dependence but imposes time-series independence.
- Abstract Optimal transport (OT) is a versatile framework for comparing probability measures, with many applications to statistics, machine learning and applied mathematics. However, OT distances suffer from computational and statistical scalability issues to high dimensions, which motivated the study of regularized OT methods like slicing, smoothing and entropic penalty. This work establishes a unified framework for deriving limit distributions of empirical regularized OT distances, semiparametric efficiency of the plug-in empirical estimator and bootstrap consistency. We apply the unified framework to provide a comprehensive statistical treatment of (i) average- and max-sliced $p$-Wasserstein distances, for which several gaps in existing literature are closed; (ii) smooth distances with compactly supported kernels, the analysis of which is motivated by computational considerations; and (iii) entropic OT, for which our method generalizes existing limit distribution results and establishes, for the first time, efficiency and bootstrap consistency. While our focus is on these three regularized OT distances as applications, the flexibility of the proposed framework renders it applicable to broad classes of functionals beyond these examples.

