
In this work, we revisit the generalization error of stochastic mirror descent for quadratically bounded losses studied in Telgarsky (2022). Quadratically bounded losses form a broad class of loss functions, capturing both Lipschitz and smooth functions, for both regression and classification problems. We study high-probability generalization for this class of losses on linear predictors in both realizable and non-realizable cases when the data are sampled i.i.d. or from a Markov chain. The prior work relies on an intricate coupling argument between the iterates of the original problem and those projected onto a bounded domain. This approach enables black-box application of concentration inequalities, but also leads to suboptimal guarantees due in part to the use of a union bound across all iterations. In this work, we depart significantly from the prior work of Telgarsky (2022) and introduce a novel approach for establishing high-probability generalization guarantees. In contrast to the prior work, we directly analyze the moment generating function of a novel supermartingale sequence and leverage the structure of stochastic mirror descent. As a result, we obtain improved bounds in all the aforementioned settings. Specifically, in the realizable case and the non-realizable case with light-tailed sub-Gaussian data, we improve the bounds by a $\log T$ factor, matching the correct rates of $1/T$ and $1/\sqrt{T}$, respectively. In the more challenging case of heavy-tailed polynomial data, we improve the existing bound by a $\mathrm{poly}(T)$ factor.
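For illustration only (not reproduced from the paper): the update underlying this analysis is the standard stochastic mirror descent step $\nabla\psi(w_{t+1}) = \nabla\psi(w_t) - \eta_t g_t$. The following minimal Python sketch assumes an unconstrained Euclidean mirror map (which reduces to SGD) and a hypothetical squared-loss linear model; the paper's actual loss class, step sizes, and data model are not reproduced here.

```python
import numpy as np

def stochastic_mirror_descent(grad_fn, w0, step_sizes, mirror_grad, mirror_grad_inv):
    """Unconstrained SMD: nabla_psi(w_{t+1}) = nabla_psi(w_t) - eta_t * g_t."""
    w = np.array(w0, dtype=float)
    for t, eta in enumerate(step_sizes):
        g = grad_fn(w, t)                         # stochastic gradient at step t
        theta = mirror_grad(w) - eta * g          # step in the dual (mirror) space
        w = mirror_grad_inv(theta)                # map back to the primal space
    return w

# Euclidean mirror map psi(w) = 0.5 * ||w||^2: both maps are the identity,
# so the update reduces to plain SGD.
identity = lambda w: w

# Hypothetical data model: squared loss of a linear predictor on i.i.d. samples,
# a smooth (hence quadratically bounded) loss.
rng = np.random.default_rng(0)
def sq_loss_grad(w, t):
    x = rng.normal(size=w.shape)
    y = rng.normal()
    return (x @ w - y) * x

w_hat = stochastic_mirror_descent(sq_loss_grad, np.zeros(5),
                                  step_sizes=[0.1 / np.sqrt(t + 1) for t in range(1000)],
                                  mirror_grad=identity, mirror_grad_inv=identity)
```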

The free boundary problem for a two-dimensional fluid permeating a porous medium is studied. This is known as the one-phase Muskat problem and is mathematically equivalent to the vertical Hele-Shaw problem driven by gravity. We prove that if the initial free boundary is the graph of a periodic Lipschitz function, then there exists a global-in-time Lipschitz solution in the strong sense, and it is the unique viscosity solution. The proof requires quantitative estimates for layer potentials and pointwise elliptic regularity in Lipschitz domains. This is the first construction of unique global strong solutions for the Muskat problem with initial data of arbitrary size.
We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur suboptimal error or have high communication and/or runtime complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity, and incur optimal error up to a $1+o(1)$ factor. Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an optimal algorithm such as PrivUnitG in the lower-dimensional space. In addition, we show that, by appropriately correlating the random projection matrices across devices, we can achieve fast server runtime. We mathematically analyze the error of the algorithm in terms of properties of the random projections, and study two instantiations. Lastly, our experiments for private mean estimation and private federated learning demonstrate that our algorithms empirically obtain nearly the same utility as optimal ones while having significantly lower communication and computational cost.
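A minimal sketch of the local-randomizer structure described above, assuming a Gaussian random projection and a trivial placeholder in place of PrivUnitG; the actual privatization mechanism, the debiasing/scaling constants, and the correlated-projection construction for fast server runtime are omitted, so this only illustrates the project-normalize-randomize pipeline.

```python
import numpy as np

def proj_unit_randomizer(x, proj, local_randomizer):
    """Project the input to a random low-dimensional subspace, normalize the
    result, then run a lower-dimensional private randomizer on it."""
    z = proj @ x                                  # random projection to k dims
    z = z / max(np.linalg.norm(z), 1e-12)         # renormalize to the unit sphere
    return local_randomizer(z)                    # e.g., PrivUnitG in k dims

def server_estimate(reports, projs):
    """Lift each k-dimensional report back to d dimensions and average."""
    lifted = [P.T @ r for P, r in zip(projs, reports)]
    return np.mean(lifted, axis=0)

# Hypothetical usage with a trivial, non-private placeholder randomizer.
d, k, n = 1000, 64, 50
rng = np.random.default_rng(1)
projs = [rng.normal(size=(k, d)) / np.sqrt(k) for _ in range(n)]
data = [x / np.linalg.norm(x) for x in rng.normal(size=(n, d))]   # unit-ball inputs
reports = [proj_unit_randomizer(x, P, local_randomizer=lambda z: z)
           for x, P in zip(data, projs)]
mean_hat = server_estimate(reports, projs)
```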

In this work, we study the convergence in high probability of clipped gradient methods when the noise distribution has heavy tails, i.e., with bounded $p$-th moments for some $1 < p \leq 2$. Prior works in this setting follow the same recipe of using concentration inequalities and an inductive argument with a union bound to control the iterates across all iterations. This method results in an increase in the failure probability by a factor of $T$, where $T$ is the number of iterations. We instead propose a new analysis approach based on bounding the moment generating function of a well-chosen supermartingale sequence. We improve the dependency on $T$ in the convergence guarantee for a wide range of algorithms with clipped gradients, including stochastic (accelerated) mirror descent for convex objectives and stochastic gradient descent for nonconvex objectives. Our high-probability bounds achieve the optimal convergence rates and match the best currently known in-expectation bounds. Our approach naturally allows the algorithms to use time-varying step sizes and clipping parameters when the time horizon is unknown, which appears difficult or even impossible using existing techniques from prior works. Furthermore, we show that in the case of clipped stochastic mirror descent, several problem constants, including the initial distance to the optimum, are not required when setting step sizes and clipping parameters.
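A minimal sketch of a clipped stochastic gradient step with time-varying step sizes and clipping thresholds; the schedules shown are hypothetical placeholders for illustration and are not the parameter settings analyzed in the paper.

```python
import numpy as np

def clipped_sgd(grad_fn, w0, num_steps, step_size_fn, clip_fn):
    """SGD with per-step gradient clipping; both the step size and the
    clipping threshold may vary with t (no horizon T needed in advance)."""
    w = np.array(w0, dtype=float)
    for t in range(num_steps):
        g = grad_fn(w, t)                 # (possibly heavy-tailed) stochastic gradient
        tau = clip_fn(t)                  # clipping threshold at step t
        norm = np.linalg.norm(g)
        if norm > tau:
            g = g * (tau / norm)          # clip the gradient to norm at most tau
        w = w - step_size_fn(t) * g
    return w

# Hypothetical horizon-free schedules, for illustration only.
step_size_fn = lambda t: 0.1 / np.sqrt(t + 1)
clip_fn = lambda t: 10.0 * np.sqrt(t + 1)
```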


Estimating frequencies of elements appearing in a data stream is a key task in large-scale data analysis. Popular sketching approaches to this problem (e.g., CountMin and CountSketch) come with worst-case guarantees that probabilistically bound the error of the estimated frequencies for any possible input. The work of Hsu et al. (2019) introduced the idea of using machine learning to tailor sketching algorithms to the specific data distribution they are being run on. In particular, their learning-augmented frequency estimation algorithm uses a learned heavy-hitter oracle which predicts which elements will appear many times in the stream. We give a novel algorithm which, in some parameter regimes, already theoretically outperforms the learning-based algorithm of Hsu et al. without the use of any predictions. Augmenting our algorithm with heavy-hitter predictions further reduces the error and improves upon the state of the art. Empirically, our algorithms achieve superior performance in all experiments compared to prior approaches.
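A minimal sketch of the learned-oracle architecture of Hsu et al., in which predicted heavy hitters are counted exactly and the remaining items go into a standard Count-Min sketch; the oracle, hashing scheme, and table sizes are illustrative assumptions, and this is not the new algorithm proposed in the paper.

```python
import numpy as np

class LearnedCountMin:
    """Learning-augmented frequency sketch: items the (hypothetical) heavy-hitter
    oracle flags are counted exactly; everything else goes into a Count-Min table."""

    def __init__(self, width, depth, oracle, seed=0):
        self.counts = np.zeros((depth, width), dtype=np.int64)
        self.width, self.depth = width, depth
        self.oracle = oracle                      # predicts whether an item is heavy
        self.exact = {}                           # exact counters for predicted heavy hitters
        rng = np.random.default_rng(seed)
        self.salts = rng.integers(1, 2**31, size=depth)

    def _bucket(self, item, row):
        return hash((item, int(self.salts[row]))) % self.width

    def update(self, item, count=1):
        if self.oracle(item):
            self.exact[item] = self.exact.get(item, 0) + count
        else:
            for r in range(self.depth):
                self.counts[r, self._bucket(item, r)] += count

    def estimate(self, item):
        if item in self.exact:
            return self.exact[item]
        # Count-Min estimate: minimum over rows, an overestimate in the worst case.
        return min(self.counts[r, self._bucket(item, r)] for r in range(self.depth))
```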

In this work, we study the problem of privately maximizing a submodular function in the streaming setting. Extensive work has been done on privately maximizing submodular functions in the general case when the function depends upon the private data of individuals. However, when the size of the data stream drawn from the domain of the objective function is large or arrives very fast, one must privately optimize the objective within the constraints of the streaming setting. We establish fundamental differentially private baselines for this problem and then derive better trade-offs between privacy and utility for the special case of decomposable submodular functions. A submodular function is decomposable when it can be written as a sum of submodular functions; this structure arises naturally when each summand function models the utility of an individual and the goal is to study the total utility of the whole population, as in the well-known Combinatorial Public Projects Problem. Finally, we complement our theoretical analysis with experimental corroboration.
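For concreteness, a decomposable objective has the following form (notation assumed here for illustration, not taken from the paper):

```latex
% Decomposable submodular objective: a sum of per-individual submodular utilities f_i,
% as in the Combinatorial Public Projects setting.
f(S) = \sum_{i=1}^{n} f_i(S),
\qquad \text{each } f_i \text{ submodular: }
f_i(S \cup \{e\}) - f_i(S) \;\ge\; f_i(T \cup \{e\}) - f_i(T)
\quad \text{for all } S \subseteq T \text{ and } e \notin T.
```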