NSF PAR Search | NSF Public Access Repository


Rates of bootstrap approximation for eigenvalues in high-dimensional PCA

https://doi.org/10.5705/ss.202021.0158

Yao, J.; Lopes, M. E. (May 2023, Statistica Sinica)

In the context of principal components analysis (PCA), the bootstrap is commonly applied to solve a variety of inference problems, such as constructing confidence intervals for the eigenvalues of the population covariance matrix Σ. However, when the data are high-dimensional, there are relatively few theoretical guarantees that quantify the performance of the bootstrap. Our aim in this paper is to analyze how well the bootstrap can approximate the joint distribution of the leading eigenvalues of the sample covariance matrix \hat{Σ}, and we establish non-asymptotic rates of approximation with respect to the multivariate Kolmogorov metric. Under certain assumptions, we show that the bootstrap can achieve a dimension-free rate of r(Σ)/\sqrt{n} up to logarithmic factors, where r(Σ) is the effective rank of Σ, and n is the sample size. From a methodological standpoint, we show that applying a transformation to the eigenvalues of \hat{Σ} before bootstrapping is an important consideration in high-dimensional settings.
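To make the idea in this abstract concrete, the following is a minimal sketch of a nonparametric bootstrap for the leading eigenvalues of a sample covariance matrix, with a transformation applied before bootstrapping. The abstract does not say which transformation the paper uses; the logarithm here is an assumption chosen for illustration, and the function names (`leading_eigenvalues`, `bootstrap_log_eigen_ci`) and all parameter defaults are hypothetical. It should not be read as the authors' procedure.

```python
import numpy as np

def leading_eigenvalues(X, k):
    """Return the k largest eigenvalues of the sample covariance of X (rows = observations)."""
    S = np.cov(X, rowvar=False)
    vals = np.linalg.eigvalsh(S)      # eigvalsh returns eigenvalues in ascending order
    return vals[::-1][:k]             # reverse to descending, keep the top k

def bootstrap_log_eigen_ci(X, k=3, n_boot=500, alpha=0.05, seed=0):
    """Basic bootstrap intervals for the top-k eigenvalues, built on the log scale.

    The log transformation is an illustrative assumption, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    lam_hat = leading_eigenvalues(X, k)
    log_hat = np.log(lam_hat)

    # Bootstrap the fluctuation of the transformed eigenvalues around log_hat.
    diffs = np.empty((n_boot, k))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample rows with replacement
        diffs[b] = np.log(leading_eigenvalues(X[idx], k)) - log_hat

    lo_q = np.quantile(diffs, alpha / 2, axis=0)
    hi_q = np.quantile(diffs, 1 - alpha / 2, axis=0)

    # Basic (reflected) interval on the log scale, mapped back with exp.
    lower = np.exp(log_hat - hi_q)
    upper = np.exp(log_hat - lo_q)
    return lam_hat, lower, upper

# Synthetic data with decaying variances (an arbitrary choice for the demo).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(300, 20)) * np.sqrt(np.linspace(5.0, 0.5, 20))
print(bootstrap_log_eigen_ci(X_demo, k=3, n_boot=200))
```

Working on the log scale keeps the interval endpoints positive after mapping back, which is one practical reason a transformation can be attractive for eigenvalues.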
Full Text Available
Error Estimation for Random Fourier Features

Yao, J.; Erichson, N. B.; Lopes, M. E. (April 2023, Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR)
Ruiz, F.; Dy, J.; van de Meent, J.-W. (Ed.)
Random Fourier Features (RFF) is among the most popular and broadly applicable approaches for scaling up kernel methods. In essence, RFF allows the user to avoid costly computations with a large kernel matrix via a fast randomized approximation. However, a pervasive difficulty in applying RFF is that the user does not know the actual error of the approximation, or how this error will propagate into downstream learning tasks. Up to now, the RFF literature has primarily dealt with these uncertainties using theoretical error bounds, but from a user’s standpoint, such results are typically impractical—either because they are highly conservative or involve unknown quantities. To tackle these general issues in a data-driven way, this paper develops a bootstrap approach to numerically estimate the errors of RFF approximations. Three key advantages of this approach are: (1) The error estimates are specific to the problem at hand, avoiding the pessimism of worst-case bounds. (2) The approach is flexible with respect to different uses of RFF, and can even estimate errors in downstream learning tasks. (3) The approach enables adaptive computation, in the sense that the user can quickly inspect the error of a rough initial kernel approximation and then predict how much extra work is needed. Furthermore, in exchange for all of these benefits, the error estimates can be obtained at a modest computational cost.
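As a rough illustration of the kind of data-driven error estimate described in this abstract, the sketch below approximates a Gaussian kernel matrix with RFF and then resamples the random features to estimate the entrywise approximation error without ever forming the exact kernel. The abstract does not give the paper's algorithm, so the resampling scheme, the max-entry error metric, the quantile level, and the names `rff_features` and `bootstrap_error_quantile` are all assumptions made for this example.

```python
import numpy as np

def rff_features(X, n_features, gamma, rng):
    """Random Fourier features for the Gaussian kernel exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies are drawn from N(0, 2 * gamma * I), the spectral measure of this kernel.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0) * np.cos(X @ W + b)   # one column per random feature

def bootstrap_error_quantile(Z, n_boot=200, q=0.9, seed=0):
    """Estimate a quantile of max|K_hat - K| by resampling feature columns.

    Illustrative assumption: the fluctuation of a resampled approximation
    around K_hat is used as a proxy for the fluctuation of K_hat around K.
    """
    rng = np.random.default_rng(seed)
    s = Z.shape[1]
    K_hat = Z @ Z.T / s
    errs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, s, size=s)      # resample features with replacement
        Zb = Z[:, idx]
        errs[i] = np.max(np.abs(Zb @ Zb.T / s - K_hat))
    return np.quantile(errs, q)

# Small synthetic check; the exact kernel is computed here only for comparison.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
gamma = 0.5
Z = rff_features(X, n_features=2000, gamma=gamma, rng=rng)
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K_exact = np.exp(-gamma * sq_dists)
actual_error = np.max(np.abs(Z @ Z.T / Z.shape[1] - K_exact))
estimated_error = bootstrap_error_quantile(Z)
print("actual:", actual_error, "bootstrap estimate:", estimated_error)
```

In practical use the exact kernel would be unavailable, so only `estimated_error` would be computed; the comparison against `K_exact` is included only to let the reader judge the estimate on a small problem.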
Bootstrapping the operator norm in high dimensions: Error estimation for covariance matrices and sketching

Lopes, M. E. (February 2023, Bernoulli)

Full Text Available
A bootstrap method for spectral statistics in high-dimensional elliptical models

https://doi.org/10.1214/23-EJS2140

Wang, Siyao; Lopes, Miles E. (January 2023, Electronic Journal of Statistics)

Full Text Available
Central limit theorem and bootstrap approximation in high dimensions: Near 1/\sqrt{n} rates via implicit smoothing

https://doi.org/10.1214/22-AOS2184

Lopes, M. E. (October 2022, Annals of Statistics)

Nonasymptotic bounds for Gaussian and bootstrap approximation have recently attracted significant interest in high-dimensional statistics. This paper studies Berry–Esseen bounds for such approximations with respect to the multivariate Kolmogorov distance, in the context of a sum of n random vectors that are p-dimensional and i.i.d. Up to now, a growing line of work has established bounds with mild logarithmic dependence on p. However, the problem of developing corresponding bounds with near n^{−1/2} dependence on n has remained largely unresolved. Within the setting of random vectors that have sub-Gaussian or subexponential entries, this paper establishes bounds with near n^{−1/2} dependence, for both Gaussian and bootstrap approximation. In addition, the proofs are considerably distinct from other recent approaches, and make use of an “implicit smoothing” operation in the Lindeberg interpolation.
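For readers unfamiliar with the metric named in this abstract, the multivariate Kolmogorov distance can be written out as follows. The notation (S_n, G, d_K) and the centering and normalization conventions are assumptions introduced here for illustration; the abstract itself does not fix them, and no specific bound from the paper is reproduced.

```latex
% Assumed setup: X_1, \dots, X_n are i.i.d. centered random vectors in \mathbb{R}^p,
% S_n = n^{-1/2} \sum_{i=1}^{n} X_i, and G \sim N(0, \operatorname{Cov}(X_1)).
% The inequality S_n \preceq t is understood coordinate-wise.
d_K(S_n, G) \;=\; \sup_{t \in \mathbb{R}^p}
  \bigl| \, \mathbb{P}(S_n \preceq t) \;-\; \mathbb{P}(G \preceq t) \, \bigr|
```

A Berry–Esseen bound in this setting is an explicit, finite-sample upper bound on d_K(S_n, G) in terms of n and p.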
Full Text Available
A sharp lower-tail bound for Gaussian maxima with application to bootstrap methods in high dimensions

https://doi.org/10.1214/21-EJS1961

Lopes, Miles E.; Yao, Junwen (January 2022, Electronic Journal of Statistics)

Full Text Available
High-Dimensional MANOVA Via Bootstrapping and Its Application to Functional and Sparse Count Data

https://doi.org/10.1080/01621459.2021.1920959

Lin, Zhenhua; Lopes, Miles E.; Müller, Hans-Georg (April 2021, Journal of the American Statistical Association)
Full Text Available
Estimating the Error of Randomized Newton Methods: A Bootstrap Approach

Chen, Xiaotie; Lopes, Miles E. (July 2020, International Conference on Machine Learning)

Full Text Available
