 Award ID(s):
 1815011
 NSFPAR ID:
 10289073
 Editor(s):
 Belkin, M.; Kpotufe, S.
 Date Published:
 Journal Name:
 Conference on Learning Theory
 Volume:
 34
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

Why are classifiers in high dimension vulnerable to “adversarial” perturbations? We show that it is likely not due to information theoretic limitations, but rather it could be due to computational constraints. First we prove that, for a broad set of classification tasks, the mere existence of a robust classifier implies that it can be found by a possibly exponentialtime algorithm with relatively few training examples. Then we give two particular classification tasks where learning a robust classifier is computationally intractable. More precisely we construct two binary classifications task in high dimensional space which are (i) information theoretically easy to learn robustly for large perturbations, (ii) efficiently learnable (nonrobustly) by a simple linear separator, (iii) yet are not efficiently robustly learnable, even for small perturbations. Specifically, for the first task hardness holds for any efficient algorithm in the statistical query (SQ) model, while for the second task we rule out any efficient algorithm under a cryptographic assumption. These examples give an exponential separation between classical learning and robust learning in the statistical query model or under a cryptographic assumption. It suggests that adversarial examples may be an unavoidable byproduct of computational limitations of learning algorithms.more » « less

In reinforcement learning, the classic objectives of maximizing discounted and finitehorizon cumulative rewards are PAClearnable: There are algorithms that learn a nearoptimal policy with high probability using a finite amount of samples and computation. In recent years, researchers have introduced objectives and corresponding reinforcementlearning algorithms beyond the classic cumulative rewards, such as objectives specified as linear temporal logic formulas. However, questions about the PAClearnability of these new objectives have remained open. This work demonstrates the PAClearnability of general reinforcementlearning objectives through sufficient conditions for PAClearnability in two analysis settings. In particular, for the analysis that considers only sample complexity, we prove that if an objective given as an oracle is uniformly continuous, then it is PAClearnable. Further, for the analysis that considers computational complexity, we prove that if an objective is computable, then it is PAClearnable. In other words, if a procedure computes successive approximations of the objective's value, then the objective is PAClearnable. We give three applications of our condition on objectives from the literature with previously unknown PAClearnability and prove that these objectives are PAClearnable. Overall, our result helps verify existing objectives' PAClearnability. Also, as some studied objectives that are not uniformly continuous have been shown to be not PAClearnable, our results could guide the design of new PAClearnable objectives.more » « less

How quickly can a given class of concepts be learned from examples? It is common to measure the performance of a supervised machine learning algorithm by plotting its “learning curve”, that is, the decay of the error rate as a function of the number of training examples. However, the classical theoretical framework for understanding learnability, the PAC model of VapnikChervonenkis and Valiant, does not explain the behavior of learning curves: the distributionfree PAC model of learning can only bound the upper envelope of the learning curves over all possible data distributions. This does not match the practice of machine learning, where the data source is typically fixed in any given scenario, while the learner may choose the number of training examples on the basis of factors such as computational resources and desired accuracy. In this paper, we study an alternative learning model that better captures such practical aspects of machine learning, but still gives rise to a complete theory of the learnable in the spirit of the PAC model. More precisely, we consider the problem of universal learning, which aims to understand the performance of learning algorithms on every data distribution, but without requiring uniformity over the distribution. The main result of this paper is a remarkable trichotomy: there are only three possible rates of universal learning. More precisely, we show that the learning curves of any given concept class decay either at an exponential, linear, or arbitrarily slow rates. Moreover, each of these cases is completely characterized by appropriate combinatorial parameters, and we exhibit optimal learning algorithms that achieve the best possible rate in each case. For concreteness, we consider in this paper only the realizable case, though analogous results are expected to extend to more general learning scenarios.more » « less

In recent years, researchers have made significant progress in devising reinforcementlearning algorithms for optimizing linear temporal logic (LTL) objectives and LTLlike objectives.Despite these advancements, there are fundamental limitations to how well this problem can be solved. Previous studies have alluded to this fact but have not examined it in depth.In this paper, we address the tractability of reinforcement learning for general LTL objectives from a theoretical perspective.We formalize the problem under the probably approximately correct learning in Markov decision processes (PACMDP) framework, a standard framework for measuring sample complexity in reinforcement learning.In this formalization, we prove that the optimal policy for any LTL formula is PACMDPlearnable if and only if the formula is in the most limited class in the LTL hierarchy, consisting of formulas that are decidable within a finite horizon.Practically, our result implies that it is impossible for a reinforcementlearning algorithm to obtain a PACMDP guarantee on the performance of its learned policy after finitely many interactions with an unconstrained environment for LTL objectives that are not decidable within a finite horizon.

In the problem of learning a class ratio from unlabeled data, which we call CR learning, the training data is unlabeled, and only the ratios, or proportions, of examples receiving each label are given. The goal is to learn a hypothesis that predicts the proportions of labels on the distribution underlying the sample. This model of learning is applicable to a wide variety of settings, including predicting the number of votes for candidates in political elections from polls. In this paper, we formally define this class and resolve foundational questions regarding the computational complexity of CR learning and characterize its relationship to PAC learning. Among our results, we show, perhaps surprisingly, that for finite VC classes what can be efficiently CR learned is a strict subset of what can be learned efficiently in PAC, under standard complexity assumptions. We also show that there exist classes of functions whose CR learnability is independent of ZFC, the standard set theoretic axioms. This implies that CR learning cannot be easily characterized (like PAC by VC dimension).more » « less