skip to main content


Title: Efficient Co-Training of Linear Separators under Weak Dependence
We develop the first polynomial-time algorithm for co-training of homogeneous linear separators under \em weak dependence, a relaxation of the condition of independence given the label. Our algorithm learns from purely unlabeled data, except for a single labeled example to break symmetry of the two classes, and works for any data distribution having an inverse-polynomial margin and with center of mass at the origin.  more » « less
Award ID(s):
1525971 1331175
PAR ID:
10057811
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Conference on Learning Theory (COLT)
Volume:
65
Page Range / eLocation ID:
302-318
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    We present approximation and exact algorithms for piecewise regression of univariate and bivariate data using fixed-degree polynomials. Specifically, given a set S of n data points (x1, y1), . . . , (xn, yn) ∈ Rd × R where d ∈ {1, 2}, the goal is to segment xi’s into some (arbitrary) number of disjoint pieces P1, . . . , Pk, where each piece Pj is associated with a fixed-degree polynomial fj : Rd → R, to minimize the total loss function λk+􏰄ni=1(yi −f(xi))2, where λ ≥ 0 is a regularization term that penalizes model complexity (number of pieces) and f : 􏰇kj=1 Pj → R is the piecewise polynomial function defined as f|Pj = fj. The pieces P1,...,Pk are disjoint intervals of R in the case of univariate data and disjoint axis-aligned rectangles in the case of bivariate data. Our error approximation allows use of any fixed-degree polynomial, not just linear functions. Our main results are the following. For univariate data, we present a (1 + ε)-approximation algorithm with time complexity O(nε log1ε), assuming that data is presented in sorted order of xi’s. For bivariate data, we √ present three results: a sub-exponential exact algorithm with running time nO( n); a polynomial-time constant- approximation algorithm; and a quasi-polynomial time approximation scheme (QPTAS). The bivariate case is believed to be NP-hard in the folklore but we could not find a published record in the literature, so in this paper we also present a hardness proof for completeness. 
    more » « less
  2. In this paper, we consider the inverse graph filtering process when the original filter is a polynomial of some graph shift on a simple connected graph. The Chebyshev polynomial approximation of high order has been widely used to approximate the inverse filter. In this paper, we propose an iterative Chebyshev polynomial approximation (ICPA) algorithm to implement the inverse filtering procedure, which is feasible to eliminate the restoration error even using Chebyshev polynomial approximation of lower order. We also provide a detailed convergence analysis for the ICPA algorithm and a distributed implementation of the ICPA algorithm on a spatially distributed network. Numerical results are included to demonstrate the satisfactory performance of the ICPA algorithm in graph signal denoising. 
    more » « less
  3. We show a simple reduction which demonstrates the cryptographic hardness of learning a single periodic neuron over isotropic Gaussian distributions in the pres- ence of noise. More precisely, our reduction shows that any polynomial-time algorithm (not necessarily gradient-based) for learning such functions under small noise implies a polynomial-time quantum algorithm for solving worst-case lattice problems, whose hardness form the foundation of lattice-based cryptography. Our core hard family of functions, which are well-approximated by one-layer neural networks, take the general form of a univariate periodic function applied to an affine projection of the data. These functions have appeared in previous seminal works which demonstrate their hardness against gradient-based (Shamir’18), and Statisti- cal Query (SQ) algorithms (Song et al.’17). We show that if (polynomially) small noise is added to the labels, the intractability of learning these functions applies to all polynomial-time algorithms, beyond gradient-based and SQ algorithms, under the aforementioned cryptographic assumptions. Moreover, we demonstrate the necessity of noise in the hardness result by designing a polynomial-time algorithm for learning certain families of such functions under exponentially small adversarial noise. Our proposed algorithm is not a gradient-based or an SQ algorithm, but is rather based on the celebrated Lenstra-Lenstra-Lovász (LLL) lattice basis reduction algorithm. Furthermore, in the absence of noise, this algorithm can be directly applied to solve CLWE detection (Bruna et al.’21) and phase retrieval with an optimal sample complexity of d + 1 samples. In the former case, this improves upon the quadratic-in-d sample complexity required in (Bruna et al.’21). 
    more » « less
  4. null (Ed.)
    Abstract We show a simple reduction which demonstrates the cryptographic hardness of learning a single periodic neuron over isotropic Gaussian distributions in the presence of noise. More precisely, our reduction shows that any polynomial-time algorithm (not necessarily gradientbased) for learning such functions under small noise implies a polynomial-time quantum algorithm for solving worst-case lattice problems, whose hardness form the foundation of lattice-based cryptography. Our core hard family of functions, which are well-approximated by one-layer neural networks, take the general form of a univariate periodic function applied to an affine projection of the data. These functions have appeared in previous seminal works which demonstrate their hardness against gradient-based (Shamir’18), and Statistical Query (SQ) algorithms (Song et al.’17). We show that if (polynomially) small noise is added to the labels, the intractability of learning these functions applies to all polynomial-time algorithms, beyond gradient-based and SQ algorithms, under the aforementioned cryptographic assumptions. Moreover, we demonstrate the necessity of noise in the hardness result by designing a polynomial-time algorithm for learning certain families of such functions under exponentially small adversarial noise. Our proposed algorithm is not a gradient-based or an SQ algorithm, but is rather based on the celebrated Lenstra-Lenstra-Lovász (LLL) lattice basis reduction algorithm. Furthermore, in the absence of noise, this algorithm can be directly applied to solve CLWE detection (Bruna et al.’21) and phase retrieval with an optimal sample complexity of d + 1 samples. In the former case, this improves upon the quadratic-in-d sample complexity required in (Bruna et al.’21). 
    more » « less
  5. We study the problem of maximizing a non-monotone submodular function subject to a cardinality constraint in the streaming model. Our main contribution is a single-pass (semi-)streaming algorithm that uses roughly $O(k / \varepsilon^2)$ memory, where $k$ is the size constraint. At the end of the stream, our algorithm post-processes its data structure using any offline algorithm for submodular maximization, and obtains a solution whose approximation guarantee is $\frac{\alpha}{1+\alpha}-\varepsilon$, where $\alpha$ is the approximation of the offline algorithm. If we use an exact (exponential time) post-processing algorithm, this leads to $\frac{1}{2}-\varepsilon$ approximation (which is nearly optimal). If we post-process with the algorithm of \cite{buchbinder2019constrained}, that achieves the state-of-the-art offline approximation guarantee of $\alpha=0.385$, we obtain $0.2779$-approximation in polynomial time, improving over the previously best polynomial-time approximation of $0.1715$ due to \cite{feldman2018less}. It is also worth mentioning that our algorithm is combinatorial and deterministic, which is rare for an algorithm for non-monotone submodular maximization, and enjoys a fast update time of $O(\frac{\log k + \log (1/\alpha {\varepsilon^2})$ per element. 
    more » « less