Title: Interpolating discriminant functions in high-dimensional Gaussian latent mixtures
Abstract: This paper considers binary classification of high-dimensional features under a postulated model with a low-dimensional latent Gaussian mixture structure and nonvanishing noise. A generalized least-squares estimator is used to estimate the direction of the optimal separating hyperplane. The estimated hyperplane is shown to interpolate the training data. While the direction vector can be consistently estimated, as could be expected from recent results in linear regression, a naive plug-in estimate fails to consistently estimate the intercept. A simple correction, which requires an independent hold-out sample, renders the procedure minimax optimal in many scenarios. The interpolation property of the latter procedure can be retained, but surprisingly depends on the way the labels are encoded.
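As a rough illustration of the phenomenon in the abstract, the sketch below (Python with NumPy; the dimensions, signal strength, and midpoint intercept rule are illustrative assumptions, not the paper's exact procedure) fits a minimum-norm least-squares classifier on ±1-encoded labels, verifies that it interpolates the training data, and recalibrates the intercept on a hold-out sample:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 500                      # n << p: overparameterized regime

# Two-class Gaussian mixture with a one-dimensional latent signal
# (mu is a hypothetical mean direction chosen for illustration).
mu = np.zeros(p); mu[0] = 3.0
y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] * mu + rng.standard_normal((n, p))

# Minimum-norm least-squares fit of the +/-1 labels on the features.
beta = np.linalg.pinv(X) @ y

# With n <= p and X of full row rank, the fit interpolates: X beta = y.
assert np.allclose(X @ beta, y)

# A naive plug-in intercept can be inconsistent; recalibrate on an
# independent hold-out sample by thresholding at the midpoint of the
# two classes' hold-out scores.
y_h = rng.choice([-1.0, 1.0], size=n)
X_h = y_h[:, None] * mu + rng.standard_normal((n, p))
scores = X_h @ beta
b = 0.5 * (scores[y_h > 0].mean() + scores[y_h < 0].mean())

y_pred = np.sign(X_h @ beta - b)
print("hold-out accuracy:", (y_pred == y_h).mean())
```

With the symmetric ±1 encoding used here the corrected intercept sits near zero; the abstract's point is that whether the corrected procedure still interpolates depends on that encoding choice.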
Award ID(s):
2210557
PAR ID:
10490549
Author(s) / Creator(s):
;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Biometrika
Volume:
111
Issue:
1
ISSN:
0006-3444
Format(s):
Medium: X
Size(s):
p. 291-308
Sponsoring Org:
National Science Foundation
More Like this
  1. Many real time series datasets exhibit structural changes over time. A popular model for capturing their temporal dependence is the vector autoregression (VAR), which can accommodate structural changes through time-evolving transition matrices. The problem then becomes to estimate both the (unknown) number of structural break points and the VAR model parameters. An additional challenge emerges in the presence of very large datasets, namely how to accomplish these two objectives in a computationally efficient manner. In this article, we propose a novel procedure that leverages a block segmentation scheme (BSS), which reduces the number of model parameters to be estimated through a regularized least-squares criterion. Specifically, BSS examines appropriately defined blocks of the available data, which, when combined with a fused-lasso-based estimation criterion, leads to significant computational gains without compromising statistical accuracy in identifying the number and location of the structural breaks. This procedure is further coupled with new local and exhaustive search steps to consistently estimate the number and relative locations of the break points. The procedure is scalable to big high-dimensional time series datasets, with a computational complexity that can achieve O(n), where n is the length of the time series (sample size), compared to an exhaustive search procedure. Extensive numerical work on synthetic data supports the theoretical findings and illustrates the attractive properties of the procedure. Finally, an application to a neuroscience dataset demonstrates its usefulness in practice. Supplementary files for this article are available online.
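The block-segmentation idea can be illustrated with a toy version (this is not the BSS/fused-lasso procedure of the article; the block length, break location, and thresholding-by-argmax rule are simplifying assumptions): fit a VAR(1) by least squares on each block and flag a break where adjacent block estimates differ most.

```python
import numpy as np

rng = np.random.default_rng(1)

# Piecewise-stationary VAR(1): the transition matrix changes at t = 150.
d, T, brk = 3, 300, 150
A1 = 0.5 * np.eye(d)
A2 = -0.5 * np.eye(d)
x = np.zeros((T, d))
for t in range(1, T):
    A = A1 if t < brk else A2
    x[t] = x[t - 1] @ A.T + 0.3 * rng.standard_normal(d)

# Block scheme: fit a VAR(1) by least squares on each block of length b,
# then flag a break between the pair of blocks whose estimates differ most.
b = 50
n_blocks = T // b
A_hat = []
for k in range(n_blocks):
    seg = x[k * b:(k + 1) * b]
    X_lag, X_cur = seg[:-1], seg[1:]
    # Solve X_cur ~ X_lag @ B, so B estimates A transposed.
    B = np.linalg.lstsq(X_lag, X_cur, rcond=None)[0]
    A_hat.append(B.T)

diffs = [np.linalg.norm(A_hat[k + 1] - A_hat[k]) for k in range(n_blocks - 1)]
est_break = (int(np.argmax(diffs)) + 1) * b
print("estimated break near t =", est_break)
```

The article's procedure then refines such coarse block-level candidates with local and exhaustive search steps to pin down the exact break locations.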
  2. Abstract: Modeling the complex interactions of systems of particles or agents is a fundamental problem across the sciences, from physics and biology to economics and the social sciences. In this work, we consider second-order, heterogeneous, multivariable models of interacting agents or particles within simple environments. We describe a nonparametric inference framework to efficiently estimate the latent interaction kernels which drive these dynamical systems. We develop a learning theory which establishes strong consistency and optimal nonparametric minimax rates of convergence for the estimators, as well as provably accurate predicted trajectories. The optimal rates depend only on the intrinsic dimension of the interactions, which is typically much smaller than the ambient dimension. Our arguments are based on a coercivity condition which ensures that the interaction kernels can be estimated in a stable fashion. The numerical algorithm presented to build the estimators is parallelizable, performs well on high-dimensional problems, and its performance is tested on a variety of complex dynamical systems.
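A stripped-down analogue of the estimation problem (the paper treats second-order heterogeneous systems; this sketch uses a first-order homogeneous model, a hand-picked true kernel, and a piecewise-constant basis, all of which are illustrative assumptions) simulates trajectories from a known interaction kernel and recovers it by least squares on pairwise-distance features:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical first-order interacting-particle system:
#   dx_i/dt = (1/N) * sum_j phi(|x_j - x_i|) * (x_j - x_i).
phi_true = lambda r: np.exp(-r)          # the kernel to be recovered

N, dim, steps, dt = 20, 2, 50, 0.02
x = rng.standard_normal((N, dim))
snapshots, velocities = [], []
for _ in range(steps):
    diff = x[None, :, :] - x[:, None, :]          # diff[i, j] = x_j - x_i
    r = np.linalg.norm(diff, axis=2)
    v = (phi_true(r)[:, :, None] * diff).mean(axis=1)
    snapshots.append(x.copy()); velocities.append(v.copy())
    x = x + dt * v

# Nonparametric estimate of phi on a piecewise-constant basis in r:
# regress the observed velocities on the per-bin kernel predictions.
edges = np.linspace(0.0, 5.0, 11)                 # 10 distance bins
rows, targets = [], []
for x_s, v_s in zip(snapshots, velocities):
    diff = x_s[None, :, :] - x_s[:, None, :]
    r = np.linalg.norm(diff, axis=2)
    feats = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = ((r >= lo) & (r < hi)).astype(float)
        feats.append((mask[:, :, None] * diff).mean(axis=1))
    rows.append(np.stack(feats, axis=-1).reshape(-1, len(edges) - 1))
    targets.append(v_s.reshape(-1))
A = np.vstack(rows); b = np.concatenate(targets)
coef, *_ = np.linalg.lstsq(A, b, rcond=None)      # estimated phi per bin

rel_res = np.linalg.norm(A @ coef - b) / np.linalg.norm(b)
print("relative residual of the kernel fit:", rel_res)
```

The regression depends only on the one-dimensional distance variable r, echoing the paper's point that rates are governed by the intrinsic dimension of the interactions rather than the ambient dimension.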
  3. Fan, J; Pan, J. (Ed.)
    Testing whether the mean vector of some population is zero is a fundamental problem in statistics. In the high-dimensional regime, where the dimension of the data p is greater than the sample size n, traditional methods such as Hotelling's T² test cannot be directly applied. One can instead project the high-dimensional vector onto a low-dimensional space, where traditional methods apply. In this paper, we propose a projection test based on a new estimate of the optimal projection direction Σ^{−1}μ. Under the assumption that the optimal projection Σ^{−1}μ is sparse, we use a regularized quadratic program with a nonconvex penalty and a linear constraint to estimate it. Simulation studies and real data analysis are conducted to examine the finite sample performance of different tests in terms of type I error and power.
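The projection-test idea can be sketched as follows (the paper estimates Σ^{−1}μ via a regularized quadratic program with a nonconvex penalty; as a simpler stand-in this sketch uses a ridge-regularized solve, and the dimensions, sparse mean, and tuning parameter lam are illustrative assumptions): estimate the direction on one half of the data and run a one-sample t-test on the projected scores of the other half.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 80, 200

# Sparse mean shift; the optimal projection direction is Sigma^{-1} mu.
mu = np.zeros(p); mu[:5] = 0.8
X = mu + rng.standard_normal((n, p))

# Data splitting: estimate the direction on one half, test on the other.
X1, X2 = X[: n // 2], X[n // 2:]

# Stand-in for the paper's penalized quadratic program: a ridge-regularized
# solve of (S + lam I) w = sample mean, with lam a hypothetical tuning choice.
lam = 1.0
S = np.cov(X1, rowvar=False)
w = np.linalg.solve(S + lam * np.eye(p), X1.mean(axis=0))

# Project the held-out half onto w and t-test the projected scores.
z = X2 @ w
t_stat = np.sqrt(len(z)) * z.mean() / z.std(ddof=1)
print("projection test statistic:", round(float(t_stat), 2))
```

Projecting first reduces the p-dimensional testing problem to a univariate one, which is why a good estimate of Σ^{−1}μ (here only crudely approximated) translates directly into power.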
  4. Abstract We consider high‐dimensional inference for potentially misspecified Cox proportional hazard models based on low‐dimensional results by Lin and Wei (1989). A desparsified Lasso estimator is proposed based on the log partial likelihood function and shown to converge to a pseudo‐true parameter vector. Interestingly, the sparsity of the true parameter can be inferred from that of the above limiting parameter. Moreover, each component of the above (nonsparse) estimator is shown to be asymptotically normal with a variance that can be consistently estimated even under model misspecifications. In some cases, this asymptotic distribution leads to valid statistical inference procedures, whose empirical performances are illustrated through numerical examples. 
  5. We study the structure of the set of all possible affine hyperplane sections of a convex polytope. We present two different cell decompositions of this set, induced by hyperplane arrangements. Using our decomposition, we bound the number of possible combinatorial types of sections and craft algorithms that compute optimal sections of the polytope according to various combinatorial and metric criteria, including sections that maximize the number of faces of a given dimension, maximize the volume, and maximize the integral of a polynomial. Our optimization algorithms run in polynomial time in fixed dimension, but the same problems become computationally hard otherwise. Our tools can be extended to intersections with halfspaces and projections onto hyperplanes. Finally, we present several experiments illustrating our theorems and algorithms on famous polytopes.