The standard definition of PAC learning (Valiant 1984) requires learners to succeed under all distributions - even ones that are intractable to sample from. This stands in contrast to samplable PAC learning (Blum, Furst, Kearns, and Lipton 1993), where learners only have to succeed under samplable distributions. We study this distinction and show that samplable PAC substantially expands the power of efficient learners. We first construct a concept class that requires exponential sample complexity in standard PAC but is learnable with polynomial sample complexity in samplable PAC. We then lift this statistical separation to the computational setting and obtain a separation relative to a random oracle. Our proofs center around a new complexity primitive, explicit evasive sets, that we introduce and study. These are sets for which membership is easy to determine but are extremely hard to sample from. Our results extend to the online setting to similarly show that its landscape changes when the adversary is assumed to be efficient instead of computationally unbounded.
more »
« less
A Computational Separation between Private Learning and Online Learning
A recent line of work has shown a qualitative equivalence between differentially private PAC learning and online learning: A concept class is privately learnable if and only if it is online learnable with a finite mistake bound. However, both directions of this equivalence incur significant losses in both sample and computational efficiency. Studying a special case of this connection, Gonen, Hazan, and Moran (NeurIPS 2019) showed that uniform or highly sample-efficient pure-private learners can be time-efficiently compiled into online learners. We show that, assuming the existence of one-way functions, such an efficient conversion is impossible even for general pure-private learners with polynomial sample complexity. This resolves a question of Neel, Roth, and Wu (FOCS 2019).
more »
« less
- Award ID(s):
- 1947889
- PAR ID:
- 10205754
- Date Published:
- Journal Name:
- 34th Conference on Neural Information Processing Systems
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
null (Ed.)We prove that every concept class with finite Littlestone dimension can be learned by an (approximate) differentially-private algorithm. This answers an open question of Alon et al. (STOC 2019) who proved the converse statement (this question was also asked by Neel et al. (FOCS 2019)). Together these two results yield an equivalence between online learnability and private PAC learnability. We introduce a new notion of algorithmic stability called “global stability” which is essential to our proof and may be of independent interest. We also discuss an application of our results to boosting the privacy and accuracy parameters of differentially-private learners.more » « less
-
null (Ed.)We introduce a simple framework for designing private boosting algorithms. We give natural conditions under which these algorithms are differentially private, efficient, and noise-tolerant PAC learners. To demonstrate our framework, we use it to construct noise-tolerant and private PAC learners for large-margin halfspaces whose sample complexity does not depend on the dimension. We give two sample complexity bounds for our large-margin halfspace learner. One bound is based only on differential privacy, and uses this guarantee as an asset for ensuring generalization. This first bound illustrates a general methodology for obtaining PAC learners from privacy, which may be of independent interest. The second bound uses standard techniques from the theory of large-margin classification (the fat-shattering dimension) to match the best known sample complexity for differentially private learning of large-margin halfspaces, while additionally tolerating random label noise.more » « less
-
Belkin, M.; Kpotufe, S. (Ed.)We study the problem of robust learning under clean-label data-poisoning attacks, where the at-tacker injects (an arbitrary set of) correctly-labeled examples to the training set to fool the algorithm into making mistakes on specific test instances at test time. The learning goal is to minimize the attackable rate (the probability mass of attackable test instances), which is more difficult than optimal PAC learning. As we show, any robust algorithm with diminishing attackable rate can achieve the optimal dependence on ε in its PAC sample complexity, i.e., O(1/ε). On the other hand, the attackable rate might be large even for some optimal PAC learners, e.g., SVM for linear classifiers. Furthermore, we show that the class of linear hypotheses is not robustly learnable when the data distribution has zero margin and is robustly learnable in the case of positive margin but requires sample complexity exponential in the dimension. For a general hypothesis class with bounded VC dimension, if the attacker is limited to add at most t >0 poison examples, the optimal robust learning sample complexity grows almost linearly with t.more » « less
-
A private learner is trained on a sample of labeled points and generates a hypothesis that can be used for predicting the labels of newly sampled points while protecting the privacy of the training set [Kasiviswannathan et al., FOCS 2008]. Past research uncovered that private learners may need to exhibit significantly higher sample complexity than non-private learners as is the case of learning of one-dimensional threshold functions [Bun et al., FOCS 2015, Alon et al., STOC 2019]. We explore prediction as an alternative to learning. A predictor answers a stream of classification queries instead of outputting a hypothesis. Earlier work has considered a private prediction model with a single classification query [Dwork and Feldman, COLT 2018]. We observe that when answering a stream of queries, a predictor must modify the hypothesis it uses over time, and in a manner that cannot rely solely on the training set. We introduce private everlasting prediction taking into account the privacy of both the training set and the (adaptively chosen) queries made to the predictor. We then present a generic construction of private everlasting predictors in the PAC model. The sample complexity of the initial training sample in our construction is quadratic (up to polylog factors) in the VC dimension of the concept class. Our construction allows prediction for all concept classes with finite VC dimension, and in particular threshold functions over infinite domains, for which (traditional) private learning is known to be impossible.more » « less
An official website of the United States government

