skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


This content will become publicly available on February 1, 2026

Title: Covariance loss, Szemeredi regularity, and differential privacy
We show how randomized rounding based on Grothendieck’s identity can be used to prove a nearly tight bound on the covariance loss–the amount of covariance that is lost by taking conditional expectation. This result yields a new type of weak Szemeredi regularity lemma for positive semidefinite matrices and kernels. Moreover, it can be used to construct differentially private synthetic data.  more » « less
Award ID(s):
1954233
PAR ID:
10617018
Author(s) / Creator(s):
; ;
Publisher / Repository:
AMS
Date Published:
Journal Name:
Proceedings of the American Mathematical Society
Volume:
153
Issue:
788
ISSN:
0002-9939
Page Range / eLocation ID:
773 to 782
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Data assimilation methods often also employ the same discrete dynamical model used to evolve the state estimate in time to propagate an approximation of the state estimation error covariance matrix. Four‐dimensional variational methods, for instance, evolve the covariance matrix implicitly via discrete tangent linear dynamics. Ensemble methods, while not forming this matrix explicitly, approximate its evolution at low rank from the evolution of the ensemble members. Such approximate evolution schemes for the covariance matrix imply an approximate evolution of the estimation error variances along its diagonal. For states that satisfy the advection equation, the continuity equation, or related hyperbolic partial differential equations (PDEs), the estimation error variance itself satisfies a known PDE, so the accuracy of the various approximations to the variances implied by the discrete covariance propagation can be determined directly. Experiments conducted by the atmospheric chemical constituent data assimilation community have indicated that such approximate variance evolution can be highly inaccurate. Through careful analysis and simple numerical experiments, we show that such poor accuracy must be expected, due to the inherent nature of discrete covariance propagation, coupled with a special property of the continuum covariance dynamics for states governed by these types of hyperbolic PDE. The intuitive explanation for this inaccuracy is that discrete covariance propagation involves approximating diagonal elements of the covariance matrix with combinations of off‐diagonal elements, even when there is a discontinuity in the continuum covariance dynamics along the diagonal. Our analysis uncovers the resulting error terms that depend on the ratio of the grid spacing to the correlation length, and these terms become very large when correlation lengths begin to approach the grid scale, for instance, as gradients steepen near the diagonal. We show that inaccurate variance evolution can manifest itself as both spurious loss and gain of variance. 
    more » « less
  2. For multivariate stationary time series many important properties, such as partial correlation, graphical models and autoregressive representations are encoded in the inverse of its spectral density matrix. This is not true for nonstationary time series, where the pertinent information lies in the inverse infinite dimensional covariance matrix operator associated with the multivariate time series. This necessitates the study of the covariance of a multivariate nonstationary time series and its relationship to its inverse. We show that if the rows/columns of the infinite dimensional covariance matrix decay at a certain rate then the rate (up to a factor) transfers to the rows/columns of the inverse covariance matrix. This is used to obtain a nonstationary autoregressive representation of the time series and a Baxter-type bound between the parameters of the autoregressive infinite representation and the corresponding finite autoregressive projection. The aforementioned results lay the foundation for the subsequent analysis of locally stationary time series. In particular, we show that smoothness properties on the covariance matrix transfer to (i) the inverse covariance (ii) the parameters of the vector autoregressive representation and (iii) the partial covariances. All results are set up in such a way that the constants involved depend only on the eigenvalue of the covariance matrix and can be applied in the high-dimensional settings with non-diverging eigenvalues. 
    more » « less
  3. The success of nonlinear optics relies largely on pulse-to-pulse consistency. In contrast, covariance-based techniques used in photoionization electron spectroscopy and mass spectrometry have shown that a wealth of information can be extracted from noise that is lost when averaging multiple measurements. Here, we apply covariance-based detection to nonlinear optical spectroscopy, and show that noise in a femtosecond laser is not necessarily a liability to be mitigated, but can act as a unique and powerful asset. As a proof of principle we apply this approach to the process of stimulated Raman scattering in α-quartz. Our results demonstrate how nonlinear processes in the sample can encode correlations between the spectral components of ultrashort pulses with uncorrelated stochastic fluctuations. This in turn provides richer information compared with the standard nonlinear optics techniques that are based on averages over many repetitions with well-behaved laser pulses. These proof-of-principle results suggest that covariance-based nonlinear spectroscopy will improve the applicability of fs nonlinear spectroscopy in wavelength ranges where stable, transform-limited pulses are not available, such as X-ray free-electron lasers which naturally have spectrally noisy pulses ideally suited for this approach. 
    more » « less
  4. Inferring graph structure from observations on the nodes is an important and popular network science task. Departing from the more common inference of a single graph, we study the problem of jointly inferring multiple graphs from the observation of signals at their nodes (graph signals), which are assumed to be stationary in the sought graphs. Graph stationarity implies that the mapping between the covariance of the signals and the sparse matrix representing the underlying graph is given by a matrix polynomial. A prominent example is that of Markov random fields, where the inverse of the covariance yields the sparse matrix of interest. From a modeling perspective, stationary graph signals can be used to model linear network processes evolving on a set of (not necessarily known) networks. Leveraging that matrix polynomials commute, a convex optimization method along with sufficient conditions that guarantee the recovery of the true graphs are provided when perfect covariance information is available. Particularly important from an empirical viewpoint, we provide high-probability bounds on the recovery error as a function of the number of signals observed and other key problem parameters. Numerical experiments demonstrate the effectiveness of the proposed method with perfect covariance information as well as its robustness in the noisy regime. 
    more » « less
  5. Lewandowski, H. (Ed.)
    Covariance mapping is widely used to study correlations of different variables in the dataset. The power of the method has been demonstrated in multi-particle imaging, including two- and three-body covariance on molecules of biological relevance and Coulomb explosion imaging (CEI) of molecular dissociation dynamics. While covariance for two particles is rather straightforward, for four-body correlations, one needs to extend covariance mapping to cumulant mapping, which has been tested in recent measurements of strong field ionization of formaldehyde. Here, I will discuss the details of how to compute cumulant mapping for the momentum sum of all four fragments of the formaldehyde molecule, and how one can perform the calculation with a faster and better algorithm. 
    more » « less