Let L be a language that can be decided in linear space and let ϵ > 0 be any constant. Let A be the exponential hardness assumption that, for every n, membership in L for inputs of length n cannot be decided by circuits of size smaller than 2^{ϵn}. We prove that for every function f : {0,1}* → {0,1} computable by a randomized logspace algorithm R, there exists a deterministic logspace algorithm D (attempting to compute f) such that, on every input x of length n, the algorithm D outputs one of the following: (1) the correct value f(x); or (2) the string "I am unable to compute f(x) because the hardness assumption A is false", followed by a (provably correct) circuit of size smaller than 2^{ϵn′} for membership in L for inputs of length n′, for some n′ = Θ(log n); that is, a circuit that refutes A. Moreover, D is explicitly constructed, given R.

We note that previous works on the hardness-versus-randomness paradigm give derandomized algorithms that rely blindly on the hardness assumption. If the hardness assumption is false, the algorithms may output incorrect values, and thus a user cannot trust that an output given by the algorithm is correct. Instead, our algorithm D verifies the computation, so that it never outputs an incorrect value. Thus, if D outputs a value for f(x), that value is certified to be correct. Moreover, if D does not output a value for f(x), it alerts that the hardness assumption was found to be false, and refutes the assumption.

Our next result is a universal derandomizer for BPL (the class of problems solvable by bounded-error randomized logspace algorithms): we give a deterministic algorithm U that takes as input a randomized logspace algorithm R and an input x, and simulates the computation of R on x deterministically. Under the widely believed assumption BPL = L, the space ...
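A useful feature of the first result above is that the refutation branch is efficiently checkable: since n′ = Θ(log n), the exhibited circuit has poly(n) size, there are only 2^{n′} = poly(n) inputs of length n′, and the linear-space decider for L runs in poly(n) time on such short inputs. The following is a minimal, hypothetical sketch of that interface only; run_D, eval_circuit, and decide_L are assumed stand-ins, not part of the paper's construction.

```python
from itertools import product

def check_output(x, run_D, eval_circuit, decide_L):
    """Wrap D's two possible outputs.  run_D(x) is assumed to return either
    ("value", bit) or ("refutation", n_prime, circuit); eval_circuit and
    decide_L are stand-ins for circuit evaluation and the linear-space
    decider for L (run here by brute force, which is feasible because
    n_prime = Theta(log |x|))."""
    tag, *payload = run_D(x)
    if tag == "value":
        # D verifies its own computation, so a returned value is certified.
        return ("f(x)", payload[0])
    # Otherwise D claims the hardness assumption A is false and exhibits a
    # small circuit for membership in L on inputs of length n_prime.
    n_prime, circuit = payload
    for bits in product((0, 1), repeat=n_prime):   # 2^{n_prime} = poly(|x|) inputs
        if eval_circuit(circuit, bits) != decide_L(bits):
            raise AssertionError("claimed refutation circuit does not decide L")
    return ("refutation of A", circuit)
```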
A priori generalization error analysis of two-layer neural networks for solving high dimensional Schrödinger eigenvalue problems
This paper analyzes the generalization error of two-layer neural networks for computing the ground state of the Schrödinger operator on a d-dimensional hypercube with Neumann boundary condition. We prove that the convergence rate of the generalization error is independent of the dimension d, under the a priori assumption that the ground state lies in a spectral Barron space. We verify this assumption by proving a new regularity estimate for the ground state in the spectral Barron space. The latter is achieved by a fixed-point argument based on the Krein-Rutman theorem.
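For orientation, the underlying variational problem is to minimize the Rayleigh quotient of the Schrödinger operator over a two-layer network ansatz; the display below is a generic sketch of that setup (the potential V, width m, and activation σ are placeholders), not the paper's precise loss, which also accounts for normalization and sampling.

```latex
% Ground state \lambda_1 of H = -\Delta + V on the hypercube \Omega = [0,1]^d
% with a Neumann boundary condition, characterized by the Rayleigh quotient:
\[
  \lambda_1 \;=\; \min_{0 \neq u \in H^1(\Omega)}
    \frac{\int_{\Omega} \bigl( |\nabla u(x)|^2 + V(x)\, u(x)^2 \bigr)\, dx}
         {\int_{\Omega} u(x)^2 \, dx}.
\]
% A two-layer (one hidden layer) network of width m restricts the minimization
% to the ansatz
\[
  u_\theta(x) \;=\; \sum_{i=1}^{m} a_i\, \sigma(w_i \cdot x + b_i),
  \qquad \theta = \{(a_i, w_i, b_i)\}_{i=1}^{m},
\]
% and the generalization error compares the minimizer of the empirical
% (sampled) version of this quotient with the true ground state.
```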
- PAR ID: 10324294
- Date Published:
- Journal Name: Communications of the American Mathematical Society
- Volume: 2
- Issue: 1
- ISSN: 2692-3688
- Page Range / eLocation ID: 1 to 21
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- An important problem in Quantum Field Theory (QFT) is to understand the structures of observables on spacetime manifolds of nontrivial topology. Such observables arise naturally when studying physical systems at finite temperature and/or finite volume, and encode subtle properties of the underlying microscopic theory that are often obscure on the flat spacetime. Locality of the QFT implies that these observables can be constructed from more basic building blocks by cutting-and-gluing along a spatial slice, where a crucial ingredient is the Hilbert space on the spatial manifold. In Conformal Field Theory (CFT), thanks to the operator-state correspondence, we have a non-perturbative understanding of the Hilbert space on a spatial sphere. However, it remains a challenge to consider more general spatial manifolds. Here we study CFTs in spacetime dimensions d > 2 on the spatial manifold T^2 × ℝ^{d−3}, which is one of the simplest manifolds beyond the spherical topology. We focus on the ground state in this Hilbert space and analyze universal properties of the ground state energy, also commonly known as the Casimir energy, which is a nontrivial function of the complex structure moduli τ of the torus. The Casimir energy is subject to constraints from modular invariance on the torus, which we spell out using PSL(2,ℤ) spectral theory. Moreover, we derive a simple universal formula for the Casimir energy in the thin torus limit using the effective field theory (EFT) from Kaluza-Klein reduction of the CFT, with exponentially small corrections from worldline instantons. We illustrate our formula with explicit examples from well-known CFTs, including the critical O(N) model in d = 3 and holographic CFTs in d ≥ 3.
- Domain generalization (DG) aims to train a model to perform well in unseen domains under different distributions. This paper considers a more realistic yet more challenging scenario, namely Single Domain Generalization (Single-DG), where only a single source domain is available for training. To tackle this challenge, we first ask when neural networks fail to generalize. We empirically identify a property of a model that correlates strongly with its generalization, which we coin model sensitivity. Based on our analysis, we propose a novel strategy, Spectral Adversarial Data Augmentation (SADA), to generate augmented images targeted at the highly sensitive frequencies; a minimal illustrative sketch of this frequency-targeted augmentation is given after this list. Models trained with these hard-to-learn samples can effectively suppress the sensitivity in the frequency space, which leads to improved generalization performance. Extensive experiments on multiple public datasets demonstrate the superiority of our approach, which surpasses the state-of-the-art single-DG methods by up to 2.55%. The source code is available at https://github.com/DIAL-RPI/Spectral-Adversarial-Data-Augmentation.
- We show that, for any ε > 0, there is a deterministic embedding of edge-weighted planar graphs of diameter D into bounded-treewidth graphs. The embedding has additive error εD. We use this construction to obtain the first efficient bicriteria approximation schemes for weighted planar graphs addressing k-Center (equivalently d-Domination), and a metric generalization of Independent Set, d-Independent Set. The approximation schemes employ a metric generalization of Baker's framework that is based on our embedding result.
- We study overparameterization in generative adversarial networks (GANs) that can interpolate the training data. We show that overparameterization can improve generalization performance and accelerate the training process. We study the generalization error as a function of latent space dimension and identify two main behaviors, depending on the learning setting. First, we show that overparameterized generative models that learn distributions by minimizing a metric or f-divergence do not exhibit double descent in generalization errors; specifically, all the interpolating solutions achieve the same generalization error. Second, we develop a new pseudo-supervised learning approach for GANs where the training utilizes pairs of fabricated (noise) inputs in conjunction with real output samples. Our pseudo-supervised setting exhibits double descent (and in some cases, triple descent) of generalization errors. We combine pseudo-supervision with overparameterization (i.e., overly large latent space dimension) to accelerate training while performing better than, or close to, the generalization performance without pseudo-supervision. While our analysis focuses mostly on linear GANs, we also apply important insights to improving the generalization of nonlinear, multilayer GANs.
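As referenced in the Single-DG entry above, the following is a minimal, hypothetical sketch of frequency-targeted adversarial augmentation in the spirit of SADA. It is not the authors' released implementation (see their repository for that); the function names, step sizes, and the choice to perturb only the Fourier amplitude are illustrative assumptions.

```python
# Hypothetical sketch of spectral adversarial data augmentation: perturb the
# Fourier amplitude of an image along the gradient of the classification loss,
# so the perturbation concentrates on the frequencies the model is most
# sensitive to.  Assumes `images` are float tensors with values in [0, 1].
import torch
import torch.nn.functional as F

def spectral_adversarial_augment(model, images, labels, step_size=0.05, steps=1):
    # Work in the 2-D Fourier domain of each image channel.
    spectrum = torch.fft.fft2(images)
    phase = torch.angle(spectrum).detach()
    amplitude = spectrum.abs().detach().requires_grad_(True)

    for _ in range(steps):
        # Reconstruct the image from the (perturbed) amplitude and fixed phase.
        perturbed = torch.fft.ifft2(torch.polar(amplitude, phase)).real
        loss = F.cross_entropy(model(perturbed), labels)
        (grad,) = torch.autograd.grad(loss, amplitude)
        # Gradient *ascent* on the loss: the largest steps land on the
        # frequencies the model relies on most.
        amplitude = (amplitude + step_size * grad.sign()).clamp_min(0.0)
        amplitude = amplitude.detach().requires_grad_(True)

    augmented = torch.fft.ifft2(torch.polar(amplitude.detach(), phase)).real
    return augmented.clamp(0.0, 1.0)
```

Perturbing only the amplitude while keeping the phase fixed is one simple way to keep the image content recognizable while concentrating the attack in frequency space; the actual SADA method may differ in how it selects and bounds the perturbation.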