Title: The geometric median and applications to robust mean estimation
This paper is devoted to the statistical properties of the geometric median, a robust measure of centrality for multivariate data, as well as its applications to the problem of mean estimation via the median of means principle. Our main theoretical results include (a) an upper bound for the distance between the mean and the median for general absolutely continuous distributions in $\mathbb R^d$, and examples of specific classes of distributions for which this bound does not depend on the ambient dimension $d$; (b) exponential deviation inequalities for the distance between the sample and the population versions of the geometric median, which again depend only on trace-type quantities and not on the ambient dimension. As a corollary, we deduce improved bounds for the multivariate median of means estimator that hold for large classes of heavy-tailed distributions.
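As a rough illustration of the two objects named in the abstract, the sketch below computes a sample geometric median and uses it to aggregate block means, in the spirit of the median of means principle. It is not taken from the paper: the Weiszfeld iteration is a standard textbook solver swapped in here for convenience, and the function names `geometric_median` and `median_of_means`, the tolerance, the iteration cap, and the random block assignment are all assumptions made for this example.

```python
import numpy as np


def geometric_median(points, tol=1e-8, max_iter=1000):
    """Approximate argmin_z sum_i ||z - x_i|| with Weiszfeld's iteration.

    Hypothetical helper for illustration; `tol` and `max_iter` are
    arbitrary choices, not values from the paper.
    """
    points = np.asarray(points, dtype=float)
    z = points.mean(axis=0)  # start from the sample mean
    for _ in range(max_iter):
        dist = np.linalg.norm(points - z, axis=1)
        # Cap the weights so an iterate sitting on a data point
        # does not cause a division by zero.
        weights = 1.0 / np.maximum(dist, 1e-12)
        z_new = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(z_new - z) <= tol:
            return z_new
        z = z_new
    return z


def median_of_means(sample, n_blocks, seed=0):
    """Split the sample into blocks, average each block, and return
    the geometric median of the block means.

    The random permutation and the `seed` argument are assumptions
    of this sketch.
    """
    sample = np.asarray(sample, dtype=float)
    rng = np.random.default_rng(seed)
    index_blocks = np.array_split(rng.permutation(len(sample)), n_blocks)
    block_means = np.stack([sample[b].mean(axis=0) for b in index_blocks])
    return geometric_median(block_means)
```

With a few gross outliers in the sample, only the blocks containing them are corrupted, so the geometric median of the block means tends to stay near the bulk of the data while the plain sample mean is pulled away.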
Award ID(s):
2045068 1908905
NSF-PAR ID:
10423179
Author(s) / Creator(s):
;
Date Published:
Journal Name:
arXiv.org
ISSN:
2331-8422
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract High Ice Water Content (HIWC) regions above tropical mesoscale convective systems (MCSs) are investigated using data from the second collaboration of the High Altitude Ice Crystals and High Ice Water Content projects (HAIC-HIWC) based in Cayenne, French Guiana in 2015. Observations from in-situ cloud probes on the French Falcon 20 determine the microphysical and thermodynamic properties of such regions. Data from a 2-D stereo probe and precipitation imaging probe show how statistical distributions of ice crystal mass median diameter (MMD), ice water content (IWC), and total number concentration (N_t) for particles with maximum dimension (D_max) > 55 μm vary with environmental conditions, temperature (T), and convective properties such as vertical velocity (w), MCS age, distance away from convective peak (L), and surface characteristics. IWC is significantly correlated with w, whereas MMD decreases and N_t increases with decreasing T, consistent with aggregation, sedimentation and vapor deposition processes at lower altitudes. MMD typically increases with IWC when IWC < 0.5 g m⁻³, but decreases with IWC when IWC > 0.5 g m⁻³ for -15 °C ≤ T ≤ -5 °C. Trends also depend on environmental conditions, such as the presence of convective updrafts that are the ice crystal source, with MMD being larger in older MCSs, consistent with aggregation and less injection of small crystals into anvils, and IWC decreasing with increasing L at lower T. The relationship between IWC and MMD depends on environmental conditions, with correlations decreasing with decreasing T. The strength of correlation between IWC and N_t increases as T decreases.
  2. Abstract

    We study nonparametric maximum likelihood estimation for two classes of multivariate distributions that imply strong forms of positive dependence; namely log‐supermodular (MTP2) distributions and log‐L♮‐concave (LLC) distributions. In both cases we also assume log‐concavity in order to ensure boundedness of the likelihood function. Given n independent and identically distributed random vectors from one of our distributions, the maximum likelihood estimator (MLE) exists a.s. and is unique a.e. with probability one when n ≥ 3. This holds independently of the ambient dimension d. We conjecture that the MLE is always the exponential of a tent function. We prove this result for samples in {0,1}^d or in under MTP2, and for samples in under LLC. Finally, we provide a conditional gradient algorithm for computing the maximum likelihood estimate.
  3. It is currently known how to characterize functions that neural networks can learn with SGD for two extremal parametrizations: neural networks in the linear regime, and neural networks with no structural constraints. However, for the main parametrization of interest (non-linear but regular networks) no tight characterization has yet been achieved, despite significant developments. We take a step in this direction by considering depth-2 neural networks trained by SGD in the mean-field regime. We consider functions on binary inputs that depend on a latent low-dimensional subspace (i.e., small number of coordinates). This regime is of interest since it is poorly understood how neural networks routinely tackle high-dimensional datasets and adapt to latent low-dimensional structure without suffering from the curse of dimensionality. Accordingly, we study SGD-learnability with O(d) sample complexity in a large ambient dimension d. Our main results characterize a hierarchical property, the merged-staircase property, that is both necessary and nearly sufficient for learning in this setting. We further show that non-linear training is necessary: for this class of functions, linear methods on any feature map (e.g., the NTK) are not capable of learning efficiently. The key tools are a new “dimension-free” dynamics approximation result that applies to functions defined on a latent space of low dimension, a proof of global convergence based on polynomial identity testing, and an improvement of lower bounds against linear methods for non-almost orthogonal functions.
  4. Abstract

    The quantification of Hutchinson's n‐dimensional hypervolume has enabled substantial progress in community ecology, species niche analysis and beyond. However, most existing methods do not support a partitioning of the different components of hypervolume. Such a partitioning is crucial to address the ‘curse of dimensionality’ in hypervolume measures and interpret the metrics on the original niche axes instead of principal components. Here, we propose the use of multivariate normal distributions for the comparison of niche hypervolumes and introduce this as the multivariate‐normal hypervolume (MVNH) framework (R package available on https://github.com/lvmuyang/MVNH).

    The framework provides parametric measures of the size and dissimilarity of niche hypervolumes, each of which can be partitioned into biologically interpretable components. Specifically, the determinant of the covariance matrix (i.e. the generalized variance) of a MVNH is a measure of total niche size, which can be partitioned into univariate niche variance components and a correlation component (a measure of dimensionality, i.e. the effective number of independent niche axes standardized by the number of dimensions). The Bhattacharyya distance (BD; a function of the geometric mean of two probability distributions) between two MVNHs is a measure of niche dissimilarity. The BD partitions total dissimilarity into the components of Mahalanobis distance (standardized Euclidean distance with correlated variables) between hypervolume centroids and the determinant ratio which measures hypervolume size difference. The Mahalanobis distance and determinant ratio can be further partitioned into univariate divergences and a correlation component.

    We use empirical examples of community‐ and species‐level analysis to demonstrate the new insights provided by these metrics. We show that the newly proposed framework enables us to quantify the relative contributions of different hypervolume components and to connect these analyses to the ecological drivers of functional diversity and environmental niche variation.

    Our approach overcomes several operational and computational limitations of popular nonparametric methods and provides a partitioning framework that has wide implications for understanding functional diversity, niche evolution, niche shifts and expansion during biotic invasions, etc.

     
  5. Lightness and sparsity are two natural parameters for Euclidean (1+ε)-spanners. Classical results show that, when the dimension d ∈ ℕ and ε > 0 are constant, every set S of n points in d-space admits a (1+ε)-spanner with O(n) edges and weight proportional to that of the Euclidean MST of S. Tight bounds on the dependence on ε > 0 for constant d ∈ ℕ have been established only recently. Le and Solomon (FOCS 2019) showed that Steiner points can substantially improve the lightness and sparsity of a (1+ε)-spanner. They gave upper bounds of Õ(ε^{-(d+1)/2}) for the minimum lightness in dimensions d ≥ 3, and Õ(ε^{-(d-1)/2}) for the minimum sparsity in d-space for all d ≥ 1. They obtained lower bounds only in the plane (d = 2). Le and Solomon (ESA 2020) also constructed Steiner (1+ε)-spanners of lightness O(ε^{-1}logΔ) in the plane, where Δ ∈ Ω(log n) is the spread of S, defined as the ratio between the maximum and minimum distance between a pair of points. In this work, we improve several bounds on the lightness and sparsity of Euclidean Steiner (1+ε)-spanners. Using a new geometric analysis, we establish lower bounds of Ω(ε^{-d/2}) for the lightness and Ω(ε^{-(d-1)/2}) for the sparsity of such spanners in Euclidean d-space for all d ≥ 2. We use the geometric insight from our lower bound analysis to construct Steiner (1+ε)-spanners of lightness O(ε^{-1}log n) for n points in the Euclidean plane.