Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions

We study the problem of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a $k$th-moment bound on the Lipschitz constants of sample functions rather than a uniform bound. We propose a new reduction-based approach that enables us to obtain the first optimal rates (up to logarithmic factors) in the heavy-tailed setting, achieving error $G_2 \cdot \frac{1}{\sqrt{n}} + G_k \cdot \big(\frac{\sqrt{d}}{n\epsilon}\big)^{1-\frac{1}{k}}$ under $(\epsilon,\delta)$-approximate differential privacy, up to a mild $\textup{polylog}(\frac{1}{\delta})$ factor, where $G_2^2$ and $G_k^k$ are the 2nd and $k$th moment bounds on sample Lipschitz constants, nearly matching a lower bound of [Lowy and Razaviyayn 2023]. We further give a suite of private algorithms in the heavy-tailed setting which improve upon our basic result under additional assumptions, including an optimal algorithm under a known-Lipschitz-constant assumption, a near-linear time algorithm for smooth functions, and an optimal linear time algorithm for smooth generalized linear models.
This paper studies differentially private stochastic convex optimization (DP-SCO) in the presence of heavy-tailed gradients, where only a $k$th-moment bound on sample Lipschitz constants is assumed, instead of a uniform bound. The authors propose a reduction-based approach that achieves the first near-optimal error rates (up to logarithmic factors) in this setting. Specifically, under $(\epsilon,\delta)$-approximate differential privacy, they achieve an error bound of $G_2 \cdot \frac{1}{\sqrt{n}} + G_k \cdot \big(\frac{\sqrt{d}}{n\epsilon}\big)^{1-\frac{1}{k}}$, up to a mild polylogarithmic factor in $\frac{1}{\delta}$, where $G_2^2$ and $G_k^k$ are the 2nd and $k$th moment bounds on sample Lipschitz constants. This nearly matches the lower bound established by Lowy and Razaviyayn (2023). Beyond the basic result, the authors introduce a suite of private algorithms that further improve performance under additional assumptions: an optimal algorithm under a known Lipschitz constant, a near-linear time algorithm for smooth functions, and an optimal linear-time algorithm for smooth generalized linear models.
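A common primitive in heavy-tailed private optimization (though not necessarily the authors' specific reduction) is per-sample gradient clipping, which converts heavy-tailed gradients into bounded ones so that standard Gaussian-noise mechanisms apply. Below is a minimal, hypothetical sketch of clipped-gradient DP-SGD in Python; the names `grad_fn` and `clip_norm`, the naive composition-based noise calibration, and all constants are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def dp_sgd_clipped(grad_fn, data, w0, clip_norm, eps, delta, T, lr):
    """Per-sample gradient clipping + Gaussian noise: a standard DP-SCO
    primitive, sketched for illustration (constants are not tuned to
    match any stated guarantee)."""
    rng = np.random.default_rng(0)
    w = np.asarray(w0, dtype=float).copy()
    n = len(data)
    # Gaussian-mechanism noise scale under naive composition over T steps.
    sigma = clip_norm * np.sqrt(2 * T * np.log(1.25 / delta)) / (n * eps)
    for _ in range(T):
        grads = np.stack([grad_fn(w, x) for x in data])        # per-sample gradients
        norms = np.maximum(np.linalg.norm(grads, axis=1, keepdims=True), 1e-12)
        grads = grads * np.minimum(1.0, clip_norm / norms)     # clip each to clip_norm
        noisy_mean = grads.mean(axis=0) + rng.normal(0.0, sigma, size=w.shape)
        w -= lr * noisy_mean
    return w
```

Under a $k$th-moment bound on sample Lipschitz constants, the choice of `clip_norm` mediates a bias/noise tradeoff, which is where terms of the form $G_k \cdot (\sqrt{d}/(n\epsilon))^{1-1/k}$ arise in this literature.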
- Award ID(s): 2505865
- PAR ID: 10631953
- Publisher / Repository: https://doi.org/10.48550/arXiv.2406.02789
- Date Published:
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Indistinguishability obfuscation, introduced by [Barak et al., Crypto 2001], aims to compile programs into unintelligible ones while preserving functionality. It is a fascinating and powerful object that has been shown to enable a host of new cryptographic goals and beyond. However, constructions of indistinguishability obfuscation have remained elusive, with all other proposals relying on heuristics or newly conjectured hardness assumptions. In this work, we show how to construct indistinguishability obfuscation from subexponential hardness of four well-founded assumptions. We prove the following informal theorem. Let $\tau \in (0,\infty)$, $\delta \in (0,1)$, $\epsilon \in (0,1)$ be arbitrary constants. Assume sub-exponential security of the following assumptions: the Learning With Errors (LWE) assumption with subexponential modulus-to-noise ratio $2^{k^\epsilon}$ and noises of magnitude polynomial in $k$, where $k$ is the dimension of the LWE secret; the Learning Parity with Noise (LPN) assumption over general prime fields $\mathbb{Z}_p$ with polynomially many LPN samples and error rate $1/\ell^\delta$, where $\ell$ is the dimension of the LPN secret; the existence of a Boolean Pseudo-Random Generator (PRG) in $NC^0$ with stretch $n^{1+\tau}$, where $n$ is the length of the PRG seed; and the Decision Linear (DLIN) assumption on symmetric bilinear groups of prime order. Then, (subexponentially secure) indistinguishability obfuscation for all polynomial-size circuits exists. Further, assuming only polynomial security of the aforementioned assumptions, there exists collusion-resistant public-key functional encryption for all polynomial-size circuits.
- We consider the problem of spherical Gaussian mixture models with $k \ge 3$ components when the components are well separated. A fundamental previous result established that separation of $\Omega(\sqrt{\log k})$ is necessary and sufficient for identifiability of the parameters with polynomial sample complexity (Regev and Vijayaraghavan, 2017). In the same context, we show that $\widetilde{O}(kd/\epsilon^2)$ samples suffice for any $\epsilon \lesssim 1/k$, closing the gap from polynomial to linear, and thus giving the first optimal sample upper bound for parameter estimation of well-separated Gaussian mixtures. We accomplish this by proving a new result for the Expectation-Maximization (EM) algorithm: we show that EM converges locally under separation $\Omega(\sqrt{\log k})$. The previous best-known guarantee required $\Omega(\sqrt{k})$ separation (Yan et al., 2017). Unlike prior work, our results do not assume or use prior knowledge of the (potentially different) mixing weights or variances of the Gaussian components. Furthermore, our results show that the finite-sample error of EM does not depend on non-universal quantities such as pairwise distances between the means of the Gaussian components. (A textbook EM sketch for spherical mixtures follows this list.)
- Traditional models of supervised learning require a learner, given examples from an arbitrary joint distribution on $\mathbb{R}^d \times \{\pm 1\}$, to output a hypothesis that competes (to within $\epsilon$) with the best-fitting concept from a class. To overcome hardness results for learning even simple concept classes, this paper introduces a smoothed-analysis framework that only requires competition with the best classifier robust to small random Gaussian perturbations. This subtle shift enables a wide array of learning results for any concept that (1) depends on a low-dimensional subspace (multi-index model) and (2) has bounded Gaussian surface area. This class includes functions of halfspaces and low-dimensional convex sets, which are only known to be learnable in non-smoothed settings with respect to highly structured distributions such as Gaussians. The analysis also yields new results for traditional non-smoothed frameworks such as learning with margin. In particular, the authors present the first algorithm for agnostically learning intersections of $k$ halfspaces in time $k \cdot \mathrm{poly}(\log k, \epsilon, \gamma)$, where $\gamma$ is the margin parameter. Previously, the best-known runtime was exponential in $k$ (Arriaga and Vempala, 1999). (A sketch of the smoothed-error benchmark follows this list.)
- In this work we consider the problem of online submodular maximization under a cardinality constraint with differential privacy (DP). A stream of $T$ submodular functions over a common finite ground set $U$ arrives online, and at each time step the decision maker must choose at most $k$ elements of $U$ before observing the function. The decision maker obtains a profit equal to the function evaluated on the chosen set and aims to learn a sequence of sets that achieves low expected regret. In the full-information setting, we develop an $(\varepsilon,\delta)$-DP algorithm with expected $(1-1/e)$-regret bound $O\big(k^2 \log|U| \sqrt{T \log(k/\delta)}\,/\,\varepsilon\big)$. This algorithm contains $k$ ordered experts that learn the best marginal increments for each item over the whole time horizon while maintaining privacy of the functions. In the bandit setting, we provide an $(\varepsilon, \delta + O(e^{-T^{1/3}}))$-DP algorithm with expected $(1-1/e)$-regret bound $O\big(\sqrt{\log(k/\delta)}\,/\,\varepsilon \cdot (k(|U|\log|U|)^{1/3})^2\, T^{2/3}\big)$. One challenge for privacy in this setting is that the payoff and feedback of expert $i$ depend on the actions taken by her $i-1$ predecessors. This particular type of information leakage is not covered by post-processing, and new analysis is required. Our techniques for maintaining privacy with feedforward may be of independent interest. (A sketch of the greedy $(1-1/e)$ benchmark follows this list.)
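As a companion to the Gaussian-mixture item above, here is a generic, textbook EM iteration for spherical mixtures with unknown means, mixing weights, and per-component variances; it is the algorithm whose local convergence that abstract analyzes, but the initialization and iteration-count choices below are illustrative assumptions, not the paper's procedure.

```python
import numpy as np

def em_spherical_gmm(X, k, iters=100, seed=0):
    """Textbook EM for a spherical Gaussian mixture in R^d with unknown
    means, mixing weights, and per-component variances. Sketch only."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, size=k, replace=False)]    # naive initialization
    weights = np.full(k, 1.0 / k)
    variances = np.full(k, X.var())
    for _ in range(iters):
        # E-step: responsibilities, computed via log-densities for stability.
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(-1)       # (n, k)
        log_p = (np.log(weights) - 0.5 * d * np.log(2 * np.pi * variances)
                 - sq / (2 * variances))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: closed-form updates for weights, means, spherical variances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(-1)
        variances = (resp * sq).sum(axis=0) / (d * nk)
    return weights, means, variances
```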
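For the smoothed-analysis item above, the benchmark is the best classifier after inputs receive small Gaussian perturbations. The snippet below is a hypothetical Monte Carlo estimate of that smoothed error for an intersection-of-halfspaces concept; it illustrates the objective being competed against, not any learning algorithm from the paper, and `sigma` and `m` are assumed parameters.

```python
import numpy as np

def intersection_of_halfspaces(X, W, b):
    """Label +1 iff x satisfies every halfspace w_j . x >= b_j."""
    return np.where((X @ W.T >= b).all(axis=1), 1, -1)

def smoothed_error(X, y, W, b, sigma=0.05, m=200, seed=0):
    """Monte Carlo estimate of classification error when each input is
    perturbed by N(0, sigma^2 I) -- the smoothed benchmark. Sketch only."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(m):
        Xp = X + sigma * rng.normal(size=X.shape)
        errors.append((intersection_of_halfspaces(Xp, W, b) != y).mean())
    return float(np.mean(errors))
```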
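Finally, for the submodular item above: the $(1-1/e)$ factor in those regret bounds is the approximation ratio of the classic offline greedy algorithm, sketched below with an optional Laplace perturbation of marginal gains as a toy illustration of private selection. This is the comparator, not the paper's expert-based online algorithm, and the noise scale is an assumption.

```python
import numpy as np

def greedy_submodular(f, ground_set, k, eps=None, seed=0):
    """Classic greedy for monotone submodular f under |S| <= k, achieving a
    (1 - 1/e) approximation. If eps is set, marginal gains get Laplace noise
    as a toy nod to private selection (not the paper's algorithm)."""
    rng = np.random.default_rng(seed)
    S = []
    for _ in range(k):
        best, best_gain = None, -np.inf
        for u in ground_set:
            if u in S:
                continue
            gain = f(S + [u]) - f(S)
            if eps is not None:
                gain += rng.laplace(scale=1.0 / eps)   # noisy marginal gain
            if gain > best_gain:
                best, best_gain = u, gain
        S.append(best)
    return S

# Example: coverage function f(S) = size of the union of the chosen sets.
sets = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 4, 5}}
f = lambda S: len(set().union(*(sets[u] for u in S))) if S else 0
print(greedy_submodular(f, list(sets), k=2))   # -> [3, 1]
```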