On the Cryptographic Hardness of Learning Single Periodic Neurons
Abstract
We show a simple reduction which demonstrates the cryptographic hardness of learning
a single periodic neuron over isotropic Gaussian distributions in the presence of noise. More
precisely, our reduction shows that any polynomial-time algorithm (not necessarily gradient-based)
for learning such functions under small noise implies a polynomial-time quantum algorithm
for solving worst-case lattice problems, whose hardness forms the foundation of lattice-based
cryptography. Our core hard family of functions, which are well-approximated by one-layer neural
networks, takes the general form of a univariate periodic function applied to an affine projection
of the data. These functions have appeared in previous seminal works which demonstrate their
hardness against gradient-based algorithms (Shamir ’18) and Statistical Query (SQ) algorithms
(Song et al. ’17). We show that if (polynomially) small noise is added to the labels, the intractability of
learning these functions applies to all polynomial-time algorithms, beyond gradient-based and
SQ algorithms, under the aforementioned cryptographic assumptions.
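Concretely, a member of this function family maps an input x in R^d to a periodic function of a one-dimensional projection. The sketch below is illustrative only: the cosine activation, the frequency parameter gamma, and the example direction w are hypothetical choices, not the paper's exact construction.

```python
import math

def periodic_neuron(x, w, gamma):
    """Illustrative hard function: a univariate periodic function (here cos)
    applied to a scaled linear projection <w, x>.
    w is a hidden unit vector in R^d; gamma > 0 controls the frequency."""
    projection = sum(wi * xi for wi, xi in zip(w, x))
    return math.cos(2 * math.pi * gamma * projection)

# Example in d = 3: the label depends on x only through <w, x>.
w = [1.0, 0.0, 0.0]  # hypothetical hidden direction
print(periodic_neuron([0.5, 2.0, -1.0], w, gamma=2.0))  # cos(2*pi) = 1.0
```

The key feature is that the label carries information about x only through the single projection onto w, which is what makes the family simultaneously expressible by a small network and hard to learn.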
Moreover, we demonstrate the necessity of noise in the hardness result by designing a
polynomial-time algorithm for learning certain families of such functions under exponentially
small adversarial noise. Our proposed algorithm is not a gradient-based or an SQ algorithm, but
is rather based on the celebrated Lenstra–Lenstra–Lovász (LLL) lattice basis reduction algorithm.
Furthermore, in the absence of noise, this algorithm can be directly applied to solve CLWE
detection (Bruna et al. ’21) and phase retrieval with an optimal sample complexity of d + 1
samples. In the former case, this improves upon the quadratic-in-d sample complexity required
in (Bruna et al. ’21).
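The kind of exact, "brittle" recovery that LLL enables can be seen in miniature below. This is a textbook LLL implementation over exact rationals, not the paper's algorithm; the scaling factor N and the example numbers are hypothetical. It recovers the hidden integer relation 2*(pi/2) - pi = 0 from scaled integer approximations of the inputs.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(B):
    """Exact Gram-Schmidt orthogonalization with mu coefficients."""
    n = len(B)
    Bs, mu = [], [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = list(B[i])
        for j in range(i):
            mu[i][j] = dot(B[i], Bs[j]) / dot(Bs[j], Bs[j])
            v = [v[t] - mu[i][j] * Bs[j][t] for t in range(len(v))]
        Bs.append(v)
    return Bs, mu

def lll(B, delta=Fraction(3, 4)):
    """Textbook LLL reduction in exact arithmetic (recomputes GS for clarity)."""
    B = [[Fraction(x) for x in row] for row in B]
    n = len(B)
    Bs, mu = gram_schmidt(B)
    k = 1
    while k < n:
        for j in range(k - 1, -1, -1):            # size reduction
            q = round(mu[k][j])
            if q:
                B[k] = [B[k][t] - q * B[j][t] for t in range(len(B[k]))]
                Bs, mu = gram_schmidt(B)
        if dot(Bs[k], Bs[k]) >= (delta - mu[k][k - 1] ** 2) * dot(Bs[k - 1], Bs[k - 1]):
            k += 1                                 # Lovász condition holds
        else:
            B[k - 1], B[k] = B[k], B[k - 1]        # swap and step back
            Bs, mu = gram_schmidt(B)
            k = max(k - 1, 1)
    return [[int(x) for x in row] for row in B]

# Hypothetical demo: hide the relation 2*(pi/2) - pi = 0 in integer data.
N = 10**6
xs = [1570796, 3141593, 1000000]  # round(N * x) for x in (pi/2, pi, 1)
basis = [[1 if i == j else 0 for j in range(3)] + [xs[i]] for i in range(3)]
reduced = lll(basis)
print(reduced[0])  # a shortest vector encodes the relation, e.g. [2, -1, 0, -1]
```

The last coordinate of a reduced vector is the (scaled) residual of the relation; with exact, noiseless data it is tiny, but polynomially small noise of size roughly 1/N already destroys this signal, which matches the role noise plays in the hardness result above.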
NSF-PAR ID: 10299545
Journal Name: Advances in Neural Information Processing Systems
ISSN: 1049-5258
Sponsoring Org: National Science Foundation
More Like this


Clustering is a fundamental primitive in unsupervised learning which gives rise to a rich class of computationally challenging inference tasks. In this work, we focus on the canonical task of clustering d-dimensional Gaussian mixtures with unknown (and possibly degenerate) covariance. Recent works (Ghosh et al. ’20; Mao, Wein ’21; Davis, Diaz, Wang ’21) have established lower bounds against the class of low-degree polynomial methods and the sum-of-squares (SoS) hierarchy for recovering certain hidden structures planted in Gaussian clustering instances. Prior work on many similar inference tasks portends that such lower bounds strongly suggest the presence of an inherent statistical-to-computational gap for clustering, that is, a parameter regime where the clustering task is statistically possible but no polynomial-time algorithm succeeds. One special case of the clustering task we consider is equivalent to the problem of finding a planted hypercube vector in an otherwise random subspace. We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap, despite the aforementioned low-degree and SoS lower bounds. To achieve this, we give an algorithm based on Lenstra–Lenstra–Lovász lattice basis reduction which achieves the statistically optimal sample complexity of d + 1 samples. This result extends the class of problems whose conjectured statistical-to-computational gaps can be "closed" by "brittle" polynomial-time algorithms, highlighting the crucial but subtle role of noise in the onset of statistical-to-computational gaps.
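For intuition, the planted-hypercube special case can be instantiated as follows. This is a hedged toy sketch: the dimensions, the Gaussian basis for the rest of the subspace, and the random mixing step are my illustrative assumptions, not the paper's exact model.

```python
import random

def planted_hypercube_instance(n, d, seed=0):
    """Toy instance of 'planted hypercube vector in a random subspace':
    returns a spanning set of a (d+1)-dimensional subspace of R^n that
    contains a hidden vector sigma in {-1, +1}^n. The observed vectors are
    mixed by random coefficients so sigma is not visible directly."""
    rng = random.Random(seed)
    sigma = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    gaussians = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(d)]
    span = [sigma] + gaussians
    observed = []
    for _ in range(d + 1):
        coeffs = [rng.gauss(0.0, 1.0) for _ in range(d + 1)]
        observed.append([sum(c * v[t] for c, v in zip(coeffs, span))
                         for t in range(n)])
    return observed, sigma

obs, sigma = planted_hypercube_instance(n=8, d=2)
```

The inference task is to recover sigma (up to sign) from the observed spanning set alone; the abstract's point is that LLL solves this exactly from d + 1 noiseless samples even though low-degree and SoS methods provably fail.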

We introduce a continuous analogue of the Learning with Errors (LWE) problem, which we name CLWE. We give a polynomial-time quantum reduction from worst-case lattice problems to CLWE, showing that CLWE enjoys similar hardness guarantees to those of LWE. Alternatively, our result can also be seen as opening new avenues of (quantum) attacks on lattice problems. Our work resolves an open problem regarding the computational complexity of learning mixtures of Gaussians without separability assumptions (Diakonikolas 2016, Moitra 2018). As an additional motivation, (a slight variant of) CLWE was considered in the context of robust machine learning (Diakonikolas et al. FOCS 2017), where hardness in the statistical query (SQ) model was shown; our work addresses the open question regarding its computational hardness (Bubeck et al. ICML 2019).
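As a rough illustration, one common formulation of the CLWE distribution draws a Gaussian sample and releases a noisy, scaled projection modulo 1. The parameter names and this exact normalization are my paraphrase of the setup in Bruna et al. ’21, so treat them as assumptions.

```python
import random

def clwe_sample(w, gamma, beta, rng=random):
    """One CLWE sample: a ~ N(0, I_d), label b = (gamma * <w, a> + e) mod 1,
    where w is a hidden unit vector and e ~ N(0, beta) is the noise.
    (Paraphrased formulation; parameter conventions vary.)"""
    d = len(w)
    a = [rng.gauss(0.0, 1.0) for _ in range(d)]
    e = rng.gauss(0.0, beta)
    b = (gamma * sum(wi * ai for wi, ai in zip(w, a)) + e) % 1.0
    return a, b

# Hypothetical hidden direction in d = 4.
w = [0.5, 0.5, 0.5, 0.5]  # unit vector
a, b = clwe_sample(w, gamma=2.0, beta=0.01)
print(0.0 <= b < 1.0)  # labels live on the torus [0, 1)
```

The detection task is to distinguish such (a, b) pairs from pairs where b is uniform on [0, 1) and independent of a; this is the problem the LLL-based algorithm above solves with d + 1 noiseless samples.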

We give the first statistical-query lower bounds for agnostically learning any non-polynomial activation with respect to Gaussian marginals (e.g., ReLU, sigmoid, sign). For the specific problem of ReLU regression (equivalently, agnostically learning a ReLU), we show that any statistical-query algorithm with tolerance n^{-(1/ε)^b} must use at least 2^{n^{cε}} queries for some constants b, c > 0, where n is the dimension and ε is the accuracy parameter. Our results rule out general (as opposed to correlational) SQ learning algorithms, which is unusual for real-valued learning problems. Our techniques involve a gradient boosting procedure for "amplifying" recent lower bounds due to Diakonikolas et al. (COLT 2020) and Goel et al. (ICML 2020) on the SQ dimension of functions computed by two-layer neural networks. The crucial new ingredient is the use of a non-standard convex functional during the boosting procedure. This also yields a best-possible reduction between two commonly studied models of learning: agnostic learning and probabilistic concepts.

We study the problem of auditing classifiers for statistical subgroup fairness. Kearns et al. [Kearns et al., 2018] showed that the problem of auditing combinatorial subgroup fairness is as hard as agnostic learning. Essentially all work on remedying statistical measures of discrimination against subgroups assumes access to an oracle for this problem, despite the fact that no efficient algorithms are known for it. If we assume the data distribution is Gaussian, or even merely log-concave, then a recent line of work has discovered efficient agnostic learning algorithms for halfspaces. Unfortunately, the reduction of Kearns et al. was formulated in terms of weak, "distribution-free" learning, and thus did not establish a connection for families such as log-concave distributions. In this work, we give positive and negative results on auditing for Gaussian distributions: On the positive side, we present an alternative approach to leverage these advances in agnostic learning and thereby obtain the first polynomial-time approximation scheme (PTAS) for auditing nontrivial combinatorial subgroup fairness: we show how to audit statistical notions of fairness over homogeneous halfspace subgroups when the features are Gaussian. On the negative side, we find that under cryptographic assumptions, no polynomial-time algorithm can guarantee any nontrivial auditing, even under Gaussian feature distributions, for general halfspace subgroups.