Abstract We show a simple reduction which demonstrates the cryptographic hardness of learning a single periodic neuron over isotropic Gaussian distributions in the presence of noise. More precisely, our reduction shows that any polynomial-time algorithm (not necessarily gradient-based) for learning such functions under small noise implies a polynomial-time quantum algorithm for solving worst-case lattice problems, whose hardness forms the foundation of lattice-based cryptography. Our core hard family of functions, which are well-approximated by one-layer neural networks, takes the general form of a univariate periodic function applied to an affine projection of the data. These functions have appeared in previous seminal works which demonstrate their hardness against gradient-based algorithms (Shamir’18) and Statistical Query (SQ) algorithms (Song et al.’17). We show that if (polynomially) small noise is added to the labels, the intractability of learning these functions applies to all polynomial-time algorithms, beyond gradient-based and SQ algorithms, under the aforementioned cryptographic assumptions. Moreover, we demonstrate the necessity of noise in the hardness result by designing a polynomial-time algorithm for learning certain families of such functions under exponentially small adversarial noise. Our proposed algorithm is not a gradient-based or an SQ algorithm, but is rather based on the celebrated Lenstra-Lenstra-Lovász (LLL) lattice basis reduction ...
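As a rough illustration of the learning setup described above, the following sketch draws isotropic Gaussian inputs and labels them with a periodic function of a one-dimensional projection, plus small label noise. The cosine activation, the frequency parameter `gamma`, the Gaussian form of the noise, and all function and variable names are assumptions made for illustration; the abstract only specifies a univariate periodic function applied to an affine projection.

```python
import math
import random


def sample_periodic_neuron(w, gamma, noise_std, n, rng):
    """Draw n labeled examples (x, y) with x ~ N(0, I_d).

    Assumed label model (illustrative only):
        y = cos(2 * pi * gamma * <w, x>) + Gaussian noise of std noise_std
    """
    d = len(w)
    data = []
    for _ in range(n):
        # Isotropic Gaussian input: independent standard normal coordinates.
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # One-dimensional projection of the input onto the hidden direction w.
        proj = sum(wi * xi for wi, xi in zip(w, x))
        # Periodic activation plus small additive label noise.
        y = math.cos(2.0 * math.pi * gamma * proj) + rng.gauss(0.0, noise_std)
        data.append((x, y))
    return data


# Hypothetical parameters: a unit-norm hidden direction in dimension 8.
dim = 8
hidden_w = [1.0 / math.sqrt(dim)] * dim
noisy_data = sample_periodic_neuron(hidden_w, gamma=4.0, noise_std=0.01,
                                    n=200, rng=random.Random(0))
clean_data = sample_periodic_neuron(hidden_w, gamma=4.0, noise_std=0.0,
                                    n=50, rng=random.Random(1))
```

Setting `noise_std` to zero gives the noiseless case; the abstract's hardness claim concerns the regime where the noise is polynomially small rather than absent.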
On the Power of Learning from k-Wise Queries
Several well-studied models of access to data samples, including statistical queries, local differential privacy and low-communication algorithms, rely on queries that provide information about a function of a single sample. (For example, a statistical query (SQ) gives an estimate of $\E_{x\sim D}[q(x)]$ for any choice of the query function $q:X\rightarrow \R$, where $D$ is an unknown data distribution.) Yet some data analysis algorithms rely on properties of functions that depend on multiple samples. Such algorithms would be naturally implemented using $k$-wise queries, each of which is specified by a function $q:X^k\rightarrow \R$. Hence it is natural to ask whether algorithms using $k$-wise queries can solve learning problems more efficiently, and by how much.
Blum, Kalai, and Wasserman~\cite{blum2003noise} showed that for any weak PAC learning problem over a fixed distribution, the complexity of learning with $k$-wise SQs is smaller than the (unary) SQ complexity by a factor of at most $2^k$. We show that for more general problems over distributions the picture is substantially richer. For every $k$, the complexity of distribution-independent PAC learning with $k$-wise queries can be exponentially larger than learning with $(k+1)$-wise queries. We then give two approaches for simulating a $k$-wise query using unary queries. The first approach exploits ...
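The difference between a unary query and a $k$-wise query can be sketched with empirical averages. This is only an illustration: a real SQ oracle returns any value within some tolerance of the true expectation, whereas the sketch below assumes direct access to i.i.d. samples, and the helper names `unary_sq` and `kwise_sq` are hypothetical.

```python
import random


def unary_sq(samples, q):
    """Empirical estimate of E_{x ~ D}[q(x)] for a unary query q: X -> R."""
    return sum(q(x) for x in samples) / len(samples)


def kwise_sq(samples, q, k):
    """Empirical estimate of E_{(x_1..x_k) ~ D^k}[q(x_1, ..., x_k)].

    Samples are split into disjoint blocks of size k so that the blocks
    are independent draws from the product distribution D^k.
    """
    blocks = [samples[i:i + k] for i in range(0, len(samples) - k + 1, k)]
    return sum(q(*block) for block in blocks) / len(blocks)


# Hypothetical distribution D: a coin that shows 1 with probability 0.8.
rng = random.Random(0)
samples = [1 if rng.random() < 0.8 else 0 for _ in range(20000)]

# Unary query: the mean of a single sample.
p_hat = unary_sq(samples, lambda x: x)

# 2-wise query: probability that two independent samples collide,
# a quantity that depends jointly on a pair of samples.
collision_hat = kwise_sq(samples, lambda x, y: 1.0 if x == y else 0.0, k=2)
```

For this toy distribution the true collision probability is $0.8^2 + 0.2^2 = 0.68$, which a single 2-wise query estimates directly from pairs of samples.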
 Publication Date:
 NSF-PAR ID:
 10026311
 Journal Name:
 Innovations in Theoretical Computer Science (ITCS)
 Sponsoring Org:
 National Science Foundation
More Like this


Kraus, Andreas (Ed.) In this paper we study the fundamental problems of maximizing a continuous non-monotone submodular function over the hypercube, both with and without coordinate-wise concavity. This family of optimization problems has several applications in machine learning, economics, and communication systems. Our main result is the first 1/2-approximation algorithm for continuous submodular function maximization; this approximation factor of 1/2 is the best possible for algorithms that only query the objective function at polynomially many points. For the special case of DR-submodular maximization, i.e. when the submodular function is also coordinate-wise concave along all coordinates, we provide a different 1/2-approximation algorithm that runs in quasi-linear time. Both these results improve upon prior work (Bian et al., 2017a,b; Soma and Yoshida, 2017). Our first algorithm uses novel ideas such as reducing the guaranteed approximation problem to analyzing a zero-sum game for each coordinate, and incorporates the geometry of this zero-sum game to fix the value at this coordinate. Our second algorithm exploits coordinate-wise concavity to identify a monotone equilibrium condition sufficient for getting the required approximation guarantee, and hunts for the equilibrium point using binary search. We further run experiments to verify the performance of our proposed algorithms in ...

Estimating the normalizing constant of an unnormalized probability distribution has important applications in computer science, statistical physics, machine learning, and statistics. In this work, we consider the problem of estimating the normalizing constant to within a multiplicative factor of 1 ± ε for a μ-strongly convex and L-smooth function f, given query access to f(x) and ∇f(x). We give both algorithms and lower bounds for this problem. Using an annealing algorithm combined with a multilevel Monte Carlo method based on underdamped Langevin dynamics, we show that $O(d^{4/3}/\varepsilon^{2})$ queries to ∇f are sufficient. Moreover, we provide an information-theoretic lower bound, showing that at least $d^{1-o(1)}/\varepsilon^{2-o(1)}$ queries are necessary. This provides a first nontrivial lower bound for the problem.

We give new quantum algorithms for evaluating composed functions whose inputs may be shared between bottom-level gates. Let $f$ be an $m$-bit Boolean function and consider an $n$-bit function $F$ obtained by applying $f$ to conjunctions of possibly overlapping subsets of $n$ variables. If $f$ has quantum query complexity $Q(f)$, we give an algorithm for evaluating $F$ using $\tilde{O}(\sqrt{Q(f)\cdot n})$ quantum queries. This improves on the bound of $O(Q(f)\cdot\sqrt{n})$ that follows by treating each conjunction independently, and our bound is tight for worst-case choices of $f$. Using completely different techniques, we prove a similar tight composition theorem for the approximate degree of $f$. By recursively applying our composition theorems, we obtain a nearly optimal $\tilde{O}(n^{1-2^{-d}})$ upper bound on the quantum query complexity and approximate degree of linear-size depth-$d$ $\mathrm{AC}^0$ circuits. As a consequence, such circuits can be PAC learned in subexponential time, even in the challenging agnostic setting. Prior to our work, a subexponential-time algorithm was not known even for linear-size depth-3 $\mathrm{AC}^0$ circuits. As an ...