Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
We introduce a novel family of expander-based error correcting codes. These codes can be sampled with randomness linear in the block-length, and achieve list decoding capacity (among other local properties). Our expander-based codes can be made starting from any family of sufficiently low-bias codes, and as a consequence, we give the first construction of a family of algebraic codes that can be sampled with linear randomness and achieve list-decoding capacity. We achieve this by introducing the notion of a pseudorandom puncturing of a code, where we select n indices of a base code C ⊂ 𝔽_q^m in a correlated fashion. Concretely, whereas a random linear code (i.e. a truly random puncturing of the Hadamard code) requires O(n log(m)) random bits to sample, we sample a pseudorandom linear code with O(n + log (m)) random bits by instantiating our pseudorandom puncturing as a length n random walk on an exapnder graph on [m]. In particular, we extend a result of Guruswami and Mosheiff (FOCS 2022) and show that a pseudorandom puncturing of a small-bias code satisfies the same local properties as a random linear code with high probability. As a further application of our techniques, we also show that pseudorandom puncturings of Reed-Solomon codes are list-recoverable beyond the Johnson bound, extending a result of Lund and Potukuchi (RANDOM 2020). We do this by instead analyzing properties of codes with large distance, and show that pseudorandom puncturings still work well in this regime.more » « less
-
We give the first tester-learner for halfspaces that succeeds universally over a wide class of structured distributions. Our universal tester-learner runs in fully polynomial time and has the following guarantee: the learner achieves error O(opt)+ϵ on any labeled distribution that the tester accepts, and moreover, the tester accepts whenever the marginal is any distribution that satisfies a Poincare inequality. In contrast to prior work on testable learning, our tester is not tailored to any single target distribution but rather succeeds for an entire target class of distributions. The class of Poincare distributions includes all strongly log-concave distributions, and, assuming the Kannan--Lovasz--Simonovits (KLS) conjecture, includes all log-concave distributions. In the special case where the label noise is known to be Massart, our tester-learner achieves error opt+ϵ while accepting all log-concave distributions unconditionally (without assuming KLS).Our tests rely on checking hypercontractivity of the unknown distribution using a sum-of-squares (SOS) program, and crucially make use of the fact that Poincare distributions are certifiably hypercontractive in the SOS framework.more » « less
-
In this paper, we introduce a new, spectral notion of approximation between directed graphs, which we call singular value (SV) approximation. SV-approximation is stronger than previous notions of spectral approximation considered in the literature, including spectral approximation of Laplacians for undirected graphs [ST04], standard approximation for directed graphs [CKP + 17], and unit-circle (UC) approximation for directed graphs [AKM + 20]. Further, SV approximation enjoys several useful properties not possessed by previous notions of approximation, e.g., it is preserved under products of randomwalk matrices and bounded matrices. We provide a nearly linear-time algorithm for SV-sparsifying (and hence UC-sparsifying) Eulerian directed graphs, as well as ℓ-step random walks on such graphs, for any ℓ≤poly(n). Combined with the Eulerian scaling algorithms of [CKK + 18], given an arbitrary (not necessarily Eulerian) directed graph and a set S of vertices, we can approximate the stationary probability mass of the (S,Sc) cut in an ℓ-step random walk to within a multiplicative error of 1/polylog(n) and an additive error of 1/poly(n) in nearly linear time. As a starting point for these results, we provide a simple black-box reduction from SV-sparsifying Eulerian directed graphs to SV-sparsifying undirected graphs; such a directed-to-undirected reduction was not known for previous notions of spectral approximation.more » « less
-
We give the first agnostic, efficient, proper learning algorithm for monotone Boolean functions. Given 2O~(n√/ε) uniformly random examples of an unknown function f:{±1}n→{±1}, our algorithm outputs a hypothesis g:{±1}n→{±1} that is monotone and (opt +ε)-close to f, where opt is the distance from f to the closest monotone function. The running time of the algorithm (and consequently the size and evaluation time of the hypothesis) is also 2O~(n√/ε), nearly matching the lower bound of [13]. We also give an algorithm for estimating up to additive error ε the distance of an unknown function f to monotone using a run-time of 2O~(n√/ε). Previously, for both of these problems, sample-efficient algorithms were known, but these algorithms were not run-time efficient. Our work thus closes this gap in our knowledge between the run-time and sample complexity.This work builds upon the improper learning algorithm of [17] and the proper semiagnostic learning algorithm of [40], which obtains a non-monotone Boolean-valued hypothesis, then “corrects” it to monotone using query-efficient local computation algorithms on graphs. This black-box correction approach can achieve no error better than 2 opt +ε information-theoretically; we bypass this barrier bya)augmenting the improper learner with a convex optimization step, andb)learning and correcting a real-valued function before rounding its values to Boolean. Our real-valued correction algorithm solves the “poset sorting” problem of [40] for functions over general posets with non-Boolean labels.more » « less
-
Let L be a language that can be decided in linear space and let ϵ>0 be any constant. Let A be the exponential hardness assumption that for every n, membership in L for inputs of length n cannot be decided by circuits of size smaller than 2ϵn. We prove that for every function f:{0,1}∗→{0,1}, computable by a randomized logspace algorithm R, there exists a deterministic logspace algorithm D (attempting to compute f), such that on every input x of length n, the algorithm D outputs one of the following:1)The correct value f(x).2)The string: “I am unable to compute f(x) because the hardness assumption A is false”, followed by a (provenly correct) circuit of size smaller than 2ϵn′ for membership in L for inputs of length n′, for some n′=Θ(logn); that is, a circuit that refutes A. Moreover, D is explicitly constructed, given R.We note that previous works on the hardness-versus-randomness paradigm give derandomized algorithms that rely blindly on the hardness assumption. If the hardness assumption is false, the algorithms may output incorrect values, and thus a user cannot trust that an output given by the algorithm is correct. Instead, our algorithm D verifies the computation so that it never outputs an incorrect value. Thus, if D outputs a value for f(x), that value is certified to be correct. Moreover, if D does not output a value for f(x), it alerts that the hardness assumption was found to be false, and refutes the assumption.Our next result is a universal derandomizer for BPL (the class of problems solvable by bounded-error randomized logspace algorithms) 1 : We give a deterministic algorithm U that takes as an input a randomized logspace algorithm R and an input x and simulates the computation of R on x, deteriministically. Under the widely believed assumption BPL=L, the space ...more » « less
-
We give new upper and lower bounds on the power of several restricted classes of arbitrary-order read-once branching programs (ROBPs) and standard-order ROBPs (SOBPs) that have received significant attention in the literature on pseudorandomness for space-bounded computation. - Regular SOBPs of length n and width ⌊w(n+1)/2⌋ can exactly simulate general SOBPs of length n and width w, and moreover an n/2-o(n) blow-up in width is necessary for such a simulation. Our result extends and simplifies prior average-case simulations (Reingold, Trevisan, and Vadhan (STOC 2006), Bogdanov, Hoza, Prakriya, and Pyne (CCC 2022)), in particular implying that weighted pseudorandom generators (Braverman, Cohen, and Garg (SICOMP 2020)) for regular SOBPs of width poly(n) or larger automatically extend to general SOBPs. Furthermore, our simulation also extends to general (even read-many) oblivious branching programs. - There exist natural functions computable by regular SOBPs of constant width that are average-case hard for permutation SOBPs of exponential width. Indeed, we show that Inner-Product mod 2 is average-case hard for arbitrary-order permutation ROBPs of exponential width. - There exist functions computable by constant-width arbitrary-order permutation ROBPs that are worst-case hard for exponential-width SOBPs. - Read-twice permutation branching programs of subexponential width can simulate polynomial-width arbitrary-order ROBPs.more » « less
-
A spanner of a graph is a subgraph that preserves lengths of shortest paths up to a multiplicative distortion. For every k, a spanner with size O(n^{1+1/k}) and stretch (2k+1) can be constructed by a simple centralized greedy algorithm, and this is tight assuming Erdős girth conjecture. In this paper we study the problem of constructing spanners in a local manner, specifically in the Local Computation Model proposed by Rubinfeld et al. (ICS 2011). We provide a randomized Local Computation Agorithm (LCA) for constructing (2r-1)-spanners with Õ(n^{1+1/r}) edges and probe complexity of Õ(n^{1-1/r}) for r ∈ {2,3}, where n denotes the number of vertices in the input graph. Up to polylogarithmic factors, in both cases, the stretch factor is optimal (for the respective number of edges). In addition, our probe complexity for r = 2, i.e., for constructing a 3-spanner, is optimal up to polylogarithmic factors. Our result improves over the probe complexity of Parter et al. (ITCS 2019) that is Õ(n^{1-1/2r}) for r ∈ {2,3}. Both our algorithms and the algorithms of Parter et al. use a combination of neighbor-probes and pair-probes in the above-mentioned LCAs. For general k ≥ 1, we provide an LCA for constructing O(k²)-spanners with Õ(n^{1+1/k}) edges using O(n^{2/3}Δ²) neighbor-probes, improving over the Õ(n^{2/3}Δ⁴) algorithm of Parter et al. By developing a new randomized LCA for graph decomposition, we further improve the probe complexity of the latter task to be O(n^{2/3-(1.5-α)/k}Δ²), for any constant α > 0. This latter LCA may be of independent interest.more » « less
-
Interval scheduling is a basic problem in the theory of algorithms and a classical task in combinatorial optimization. We develop a set of techniques for partitioning and grouping jobs based on their starting and ending times, that enable us to view an instance of interval scheduling on many jobs as a union of multiple interval scheduling instances, each containing only a few jobs. Instantiating these techniques in dynamic and local settings of computation leads to several new results. For (1+ε)-approximation of job scheduling of n jobs on a single machine, we develop a fully dynamic algorithm with O((log n)/ε) update and O(log n) query worst-case time. Further, we design a local computation algorithm that uses only O((log N)/ε) queries when all jobs are length at least 1 and have starting/ending times within [0,N]. Our techniques are also applicable in a setting where jobs have rewards/weights. For this case we design a fully dynamic deterministic algorithm whose worst-case update and query time are poly(log n,1/ε). Equivalently, this is the first algorithm that maintains a (1+ε)-approximation of the maximum independent set of a collection of weighted intervals in poly(log n,1/ε) time updates/queries. This is an exponential improvement in 1/ε over the running time of a randomized algorithm of Henzinger, Neumann, and Wiese [SoCG, 2020], while also removing all dependence on the values of the jobs' starting/ending times and rewards, as well as removing the need for any randomness. We also extend our approaches for interval scheduling on a single machine to examine the setting with M machines.more » « less
-
We give a deterministic white-box algorithm to estimate the expectation of a read-once branching program of length n and width w in space Õ(logn+√logn·logw). In particular, we obtain an almost optimal space Õ(logn) derandomization of programs up to width w=2√logn. Previously, the best known space complexity for this problem was O(min{logn· logw,log3/2n+√logn· logw}) via the classic algorithms of Savitch (JCSS 1970) and Saks and Zhou (JCSS 1999), which only achieve space Õ(logn) for w=polylog(n). We prove this result by showing that a variant of the Saks-Zhou algorithm developed by Cohen, Doron, and Sberlo (ECCC 2022) still works without executing one of the steps in the algorithm, the so-called random shift step. This allows us to extend their algorithm from computing the nth power of a w× w stochastic matrix to multiplying n distinct w× w stochastic matrices with no degradation in space consumption. In the regime where w≥ n, we also show that our approach can achieve parameters matching those of the original Saks-Zhou algorithm (with no loglog factors). Finally, we show that for w≤ 2√logn, an algorithm even simpler than our algorithm and that of Saks and Zhou achieves space O(log3/2 n).more » « less
-
There are many important high dimensional function classes that have fast agnostic learning algorithms when strong assumptions on the distribution of examples can be made, such as Gaussianity or uniformity over the domain. But how can one be sufficiently confident that the data indeed satisfies the distributional assumption, so that one can trust in the output quality of the agnostic learning algorithm? We propose a model by which to systematically study the design of tester-learner pairs (A,T), such that if the distribution on examples in the data passes the tester T then one can safely trust the output of the agnostic learner A on the data. To demonstrate the power of the model, we apply it to the classical problem of agnostically learning halfspaces under the standard Gaussian distribution and present a tester-learner pair with a combined run-time of nÕ(1/є4). This qualitatively matches that of the best known ordinary agnostic learning algorithms for this task. In contrast, finite sample Gaussian distribution testers do not exist for the L1 and EMD distance measures. Previously it was known that half-spaces are well-approximated with low-degree polynomials relative to the Gaussian distribution. A key step in our analysis is showing that this is the case even relative to distributions whose low-degree moments approximately match those of a Gaussian. We also go beyond spherically-symmetric distributions, and give a tester-learner pair for halfspaces under the uniform distribution on {0,1}n with combined run-time of nÕ(1/є4). This is achieved using polynomial approximation theory and critical index machinery of [Diakonikolas, Gopalan, Jaiswal, Servedio, and Viola 2009]. Can one design agnostic learning algorithms under distributional assumptions and count on future technical work to produce, as a matter of course, tester-learner pairs with similar run-time? Our answer is a resounding no, as we show there exist some well-studied settings for which 2Õ(√n) run-time agnostic learning algorithms are available, yet the combined run-times of tester-learner pairs must be as high as 2Ω(n). On that account, the design of tester-learner pairs is a research direction in its own right independent of standard agnostic learning. To be specific, our lower bounds apply to the problems of agnostically learning convex sets under the Gaussian distribution and for monotone Boolean functions under the uniform distribution over {0,1}n.more » « less