Title: Efficient Private Algorithms for Learning Large-Margin Halfspaces
We present new differentially private algorithms for learning a large-margin halfspace. In contrast to previous algorithms, which are based either on differentially private simulations of the statistical query model or on private convex optimization, the sample complexity of our algorithms depends only on the margin of the data, and not on the dimension. We complement our results with a lower bound, showing that the dependence of our upper bounds on the margin is optimal.
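For reference, the two standard notions the abstract relies on, stated here as a reader's gloss rather than text from the paper: a randomized learner M is (ε, δ)-differentially private if its output distribution changes little when a single training example is replaced, and data with unit-norm features has margin γ when some unit vector separates it with slack at least γ. Formally,

$$\Pr[M(S) \in O] \;\le\; e^{\varepsilon}\,\Pr[M(S') \in O] + \delta \quad \text{for all neighboring } S, S' \text{ and all events } O,$$

$$y\,\langle w^\ast, x\rangle \;\ge\; \gamma \quad \text{for all examples } (x, y) \text{ with } \|x\| \le 1 \text{ and some unit vector } w^\ast.$$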
Award ID(s): 1750716
PAR ID: 10212624
Author(s) / Creator(s): ; ;
Date Published:
Journal Name: Proceedings of Machine Learning Research
Volume: 117
ISSN: 2640-3498
Page Range / eLocation ID: 704-724
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1.
    We introduce a simple framework for designing private boosting algorithms. We give natural conditions under which these algorithms are differentially private, efficient, and noise-tolerant PAC learners. To demonstrate our framework, we use it to construct noise-tolerant and private PAC learners for large-margin halfspaces whose sample complexity does not depend on the dimension. We give two sample complexity bounds for our large-margin halfspace learner. One bound is based only on differential privacy, and uses this guarantee as an asset for ensuring generalization. This first bound illustrates a general methodology for obtaining PAC learners from privacy, which may be of independent interest. The second bound uses standard techniques from the theory of large-margin classification (the fat-shattering dimension) to match the best known sample complexity for differentially private learning of large-margin halfspaces, while additionally tolerating random label noise. 
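    As background intuition for why the margin, rather than the dimension, can control sample complexity, the classical (non-private) perceptron argument already gives a dimension-free guarantee: on unit-norm data with margin $$\gamma$$, a perceptron makes at most $$1/\gamma^2$$ mistakes. The Python sketch below illustrates only that classical fact; it is not the boosting framework or the private learner described above, and the synthetic data generation is an arbitrary choice for the demonstration.

    import numpy as np

    rng = np.random.default_rng(0)
    d, n, gamma = 200, 300, 0.05              # dimension, sample size, margin

    w_star = rng.normal(size=d)
    w_star /= np.linalg.norm(w_star)          # hidden unit-norm halfspace

    # Rejection-sample unit-norm points whose margin w.r.t. w_star is at least gamma.
    X, y = [], []
    while len(X) < n:
        x = rng.normal(size=d)
        x /= np.linalg.norm(x)
        if abs(w_star @ x) >= gamma:
            X.append(x)
            y.append(np.sign(w_star @ x))

    # One perceptron pass: the number of updates is at most 1/gamma**2, independent of d.
    w, mistakes = np.zeros(d), 0
    for x, label in zip(X, y):
        if label * (w @ x) <= 0:
            w += label * x
            mistakes += 1
    print(f"mistakes: {mistakes}  (dimension-free bound: {1 / gamma**2:.0f})")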
  2. Meka, Raghu (Ed.)
    Differential privacy and sublinear algorithms are both rapidly emerging algorithmic themes in the era of big-data analysis. Although recent works have shown the existence of differentially private sublinear algorithms for many problems, including graph parameter estimation and clustering, little is known about hardness results for such algorithms. In this paper, we initiate the study of lower bounds for problems that seek both differentially private and sublinear-time algorithms. Our main result is that the two desiderata are incompatible in general. In particular, we prove that a simple problem based on one-way marginals admits both a differentially private algorithm and a sublinear-time algorithm, but does not admit a "strictly" sublinear-time algorithm that is also differentially private.
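    For concreteness, the one-way-marginals task referred to above has a simple linear-time private baseline via the standard Laplace mechanism: release each attribute's mean with noise calibrated to its sensitivity. The sketch below shows only that baseline (the parameter choices are illustrative, not the paper's construction); the abstract's point is that no strictly sublinear-time algorithm can satisfy both requirements at once.

    import numpy as np

    def private_one_way_marginals(data, epsilon, rng=None):
        """data: (n, d) binary array; returns epsilon-DP estimates of the d column means."""
        rng = rng or np.random.default_rng()
        n, d = data.shape
        # Replacing one row moves the vector of column means by at most d/n in L1 norm,
        # so per-coordinate Laplace noise with scale d / (n * epsilon) gives epsilon-DP.
        scale = d / (n * epsilon)
        return data.mean(axis=0) + rng.laplace(scale=scale, size=d)

    rng = np.random.default_rng(1)
    data = (rng.random((10_000, 5)) < 0.3).astype(float)
    print(private_one_way_marginals(data, epsilon=1.0, rng=rng))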
    We consider the problem of checking the differential privacy of online randomized algorithms that process a stream of inputs and produce outputs corresponding to each input. This paper generalizes an automaton model called DiP automata [10] to describe such algorithms by allowing multiple real-valued storage variables. A DiP automaton is a parametric automaton whose behavior depends on the privacy budget ε. An automaton A is said to be differentially private if, for some D, the automaton is Dε-differentially private for all values of ε > 0. We identify a precise characterization of the class of all differentially private DiP automata. We show that the problem of determining whether a given DiP automaton belongs to this class is PSPACE-complete. Our PSPACE algorithm also computes a value for D when the given automaton is differentially private. The algorithm has been implemented, and experiments demonstrating its effectiveness are presented.
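    One way to read the acceptance condition, stated here as a reader's gloss (the paper's exact adjacency relation for input streams is not reproduced in this record): the automaton A is differentially private if there is a constant D such that, for every privacy budget $$\varepsilon > 0$$, every pair of adjacent input streams $$\sigma, \sigma'$$, and every set of output sequences $$O$$,

    $$\Pr[A_{\varepsilon}(\sigma) \in O] \;\le\; e^{D\varepsilon}\,\Pr[A_{\varepsilon}(\sigma') \in O].$$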
  4. Mikołaj Bojańczyk, Emanuela Merelli (Eds.)
    We initiate a systematic study of algorithms that are both differentially private and run in sublinear time, for several problems in which the goal is to estimate natural graph parameters. Our main result is a differentially private $(1+\rho)$-approximation algorithm for the problem of computing the average degree of a graph, for every $\rho > 0$. The running time of the algorithm is roughly the same (for sparse graphs) as that of its non-private version proposed by Goldreich and Ron (Sublinear Algorithms, 2005). We also obtain the first differentially private sublinear-time approximation algorithms for the maximum matching size and the minimum vertex cover size of a graph. An overarching technique we employ is the notion of coupled global sensitivity of randomized algorithms. Related variants of this notion of sensitivity have been used in the literature in ad hoc ways; here we formalize the notion and develop it as a unifying framework for the privacy analysis of randomized approximation algorithms.
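    As a point of comparison only (not the paper's sublinear algorithm), the obvious linear-time private baseline for average degree applies the Laplace mechanism to the exact answer: under edge-level adjacency, adding or removing one edge changes the degree sum by 2, so the average degree has global sensitivity 2/n. The sketch below implements just that baseline; the adjacency notion and parameters are assumptions for illustration, and the coupled-global-sensitivity analysis is not shown.

    import numpy as np

    def private_average_degree(degrees, epsilon, rng=None):
        """degrees: array of vertex degrees; returns an epsilon-DP estimate of the average degree."""
        rng = rng or np.random.default_rng()
        sensitivity = 2.0 / len(degrees)      # one edge changes two degrees by 1 each
        return degrees.mean() + rng.laplace(scale=sensitivity / epsilon)

    degrees = np.array([3.0, 1.0, 2.0, 2.0, 4.0, 0.0, 2.0, 2.0])   # toy degree sequence
    print(private_average_degree(degrees, epsilon=0.5))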
  5.
    Differential privacy is a formal, mathematical definition of data privacy that has gained traction in academia, industry, and government. The task of correctly constructing differentially private algorithms is non-trivial, and mistakes have been made in foundational algorithms. Currently, there is no automated support for converting an existing, non-private program into a differentially private version. In this paper, we propose a technique for automatically learning an accurate and differentially private version of a given non-private program. We show how to solve this difficult program synthesis problem via a combination of techniques: carefully picking representative example inputs, reducing the problem to continuous optimization, and mapping the results back to symbolic expressions. We demonstrate that our approach is able to learn foundational algorithms from the differential privacy literature and significantly outperforms natural program synthesis baselines.
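    To make the synthesis target concrete, the pair below shows the kind of before/after relationship such a tool aims to discover automatically: a non-private counting query and a differentially private counterpart with calibrated Laplace noise. The function names and the specific mechanism are illustrative assumptions, not output produced by the paper's system.

    import numpy as np

    def count_above(xs, threshold):
        """Original, non-private program: how many records exceed the threshold."""
        return sum(1 for x in xs if x > threshold)

    def private_count_above(xs, threshold, epsilon, rng=None):
        """A private counterpart: a counting query has sensitivity 1, so Laplace(1/epsilon) noise suffices."""
        rng = rng or np.random.default_rng()
        return count_above(xs, threshold) + rng.laplace(scale=1.0 / epsilon)

    xs = [0.2, 0.9, 1.4, 0.7, 2.1]
    print(count_above(xs, 1.0), private_count_above(xs, 1.0, epsilon=1.0))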