

Title: Robustness Analysis of Loop-Free Floating-Point Programs via Symbolic Automatic Differentiation
Automated techniques for analyzing floating-point code for roundoff error and control-flow instability are of growing importance: one must compute rigorous estimates of roundoff error and determine the extent of control-flow instability caused by roundoff error flowing into conditional statements. Currently available analysis techniques are either non-rigorous or fail to produce tight roundoff error bounds in many practical situations. Our approach, embodied in a new tool called Seesaw, employs symbolic reverse-mode automatic differentiation, smoothly handling conditionals and offering tight error bounds. Key steps in Seesaw include weakening conditionals to accommodate roundoff error, computing a symbolic error function that depends on the program paths taken, and optimizing this function, whose domain may be non-rectangular, by paving it with a rectangle-based cover. Our benchmarks cover many practical examples to which such rigorous analysis has hitherto not been applied, or has yielded inferior results.
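As a rough, hedged illustration of what a symbolic error function can look like (a toy Python/SymPy sketch, not Seesaw's actual implementation), the standard rounding model fl(op) = op * (1 + e) with |e| <= u lets the reverse-mode adjoint of every rounded intermediate assemble a first-order error expression; the function name and traversal strategy below are our own illustrative choices:

```python
# Toy sketch (not Seesaw): a first-order symbolic roundoff error function.
# Model: every floating-point operation returns v * (1 + e), |e| <= u, so
# to first order the error is u * sum_k |adjoint_k * v_k| over all rounded
# intermediates v_k, with adjoints as computed by reverse-mode AD.
import sympy as sp

def symbolic_error_function(expr, u):
    terms = []
    for sub in sp.preorder_traversal(expr):
        if sub.is_Atom:                 # inputs/constants: no fresh rounding
            continue
        t = sp.Dummy('t')
        adjoint = sp.diff(expr.subs(sub, t), t).subs(t, sub)  # d expr / d sub
        terms.append(sp.Abs(adjoint * sub))
    return u * sp.Add(*terms)

x, y = sp.symbols('x y', real=True)
u = sp.Rational(1, 2**53)               # unit roundoff for IEEE double
print(symbolic_error_function(x*y + sp.sin(x), u))
# -> u * (|x*y + sin(x)| + |x*y| + |sin(x)|)
```

Optimizing such an expression over the feasible input region (which path conditions can make non-rectangular) is then a global-optimization problem, which the paper attacks by paving the region with a rectangle-based cover.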
Award ID(s): 1704715
NSF-PAR ID: 10294505
Journal Name: IEEE Cluster 2021
Sponsoring Org: National Science Foundation
More Like this
  1. INTRODUCTION: Solving quantum many-body problems, such as finding ground states of quantum systems, has far-reaching consequences for physics, materials science, and chemistry. Classical computers have facilitated many profound advances in science and technology, but they often struggle to solve such problems. Scalable, fault-tolerant quantum computers will be able to solve a broad array of quantum problems but are unlikely to be available for years to come. Meanwhile, how can we best exploit our powerful classical computers to advance our understanding of complex quantum systems? Recently, classical machine learning (ML) techniques have been adapted to investigate problems in quantum many-body physics. So far, these approaches are mostly heuristic, reflecting the general paucity of rigorous theory in ML. Although they have been shown to be effective in some intermediate-size experiments, these methods are generally not backed by convincing theoretical arguments to ensure good performance.

     RATIONALE: A central question is whether classical ML algorithms can provably outperform non-ML algorithms in challenging quantum many-body problems. We provide a concrete answer by devising and analyzing classical ML algorithms for predicting the properties of ground states of quantum systems. We prove that these ML algorithms can efficiently and accurately predict ground-state properties of gapped local Hamiltonians, after learning from data obtained by measuring other ground states in the same quantum phase of matter. Furthermore, under a widely accepted complexity-theoretic conjecture, we prove that no efficient classical algorithm that does not learn from data can achieve the same prediction guarantee. By generalizing from experimental data, ML algorithms can solve quantum many-body problems that could not be solved efficiently without access to experimental data.

     RESULTS: We consider a family of gapped local quantum Hamiltonians, where the Hamiltonian H(x) depends smoothly on m parameters (denoted by x). The ML algorithm learns from a set of training data consisting of sampled values of x, each accompanied by a classical representation of the ground state of H(x). These training data could be obtained from either classical simulations or quantum experiments. During the prediction phase, the ML algorithm predicts a classical representation of ground states for Hamiltonians different from those in the training data; ground-state properties can then be estimated using the predicted classical representation. Specifically, our classical ML algorithm predicts expectation values of products of local observables in the ground state, with a small error when averaged over the value of x. The run time of the algorithm and the amount of training data required both scale polynomially in m and linearly in the size of the quantum system. Our proof of this result builds on recent developments in quantum information theory, computational learning theory, and condensed matter theory. Furthermore, under the widely accepted conjecture that nondeterministic polynomial-time (NP)-complete problems cannot be solved in randomized polynomial time, we prove that no polynomial-time classical algorithm that does not learn from data can match the prediction performance achieved by the ML algorithm. In a related contribution using similar proof techniques, we show that classical ML algorithms can efficiently learn how to classify quantum phases of matter. In this scenario, the training data consist of classical representations of quantum states, where each state carries a label indicating whether it belongs to phase A or phase B. The ML algorithm then predicts the phase label for quantum states that were not encountered during training. The classical ML algorithm not only classifies phases accurately, but also constructs an explicit classifying function. Numerical experiments verify that our proposed ML algorithms work well in a variety of scenarios, including Rydberg atom systems, two-dimensional random Heisenberg models, symmetry-protected topological phases, and topologically ordered phases.

     CONCLUSION: We have rigorously established that classical ML algorithms, informed by data collected in physical experiments, can effectively address some quantum many-body problems. These rigorous results boost our hopes that classical ML trained on experimental data can solve practical problems in chemistry and materials science that would be too hard to solve using classical processing alone. Our arguments build on the concept of a succinct classical representation of quantum states derived from randomized Pauli measurements. Although some quantum devices lack the local control needed to perform such measurements, we expect that other classical representations could be exploited by classical ML with similarly powerful results. How can we make use of accessible measurement data to predict properties reliably? Answering such questions will expand the reach of near-term quantum platforms.

     Figure caption: Classical algorithms for quantum many-body problems. Classical ML algorithms learn from training data, obtained from either classical simulations or quantum experiments. Then, the ML algorithm produces a classical representation for the ground state of a physical system that was not encountered during training. Classical algorithms that do not learn from data may require substantially longer computation time to achieve the same task.
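As a hedged toy sketch of the training/prediction pipeline described above (not the paper's algorithm, and with synthetic numbers standing in for its classical shadow data), one can fit an off-the-shelf kernel regressor from the Hamiltonian parameters x to an expectation value estimated on the corresponding ground states:

```python
# Toy stand-in for "learning ground-state properties from data": kernel
# ridge regression from parameters x to one expectation value <O>(x).
# All data here are synthetic; real training data would come from
# classical simulation or randomized Pauli measurements on experiments.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
m = 3                                        # number of Hamiltonian parameters
X_train = rng.uniform(-1, 1, size=(200, m))  # sampled values of x
# Smooth dependence of <O> on x, as expected inside a gapped phase.
y_train = np.cos(X_train @ np.array([1.0, 0.5, -0.3])) \
          + 0.05 * rng.normal(size=200)      # measurement noise

model = KernelRidge(kernel="rbf", gamma=1.0, alpha=1e-3).fit(X_train, y_train)
X_new = rng.uniform(-1, 1, size=(5, m))      # Hamiltonians unseen in training
print(model.predict(X_new))                  # predicted <O> for the new x values
```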
  2. Virtually all real-valued computations are carried out using floating-point data types and operations. With increasing emphasis on overall computational efficiency, compilers are increasingly attempting to optimize floating-point expressions. Practical reasoning about the correctness of these optimizations requires error analysis procedures that are rigorous (ideally, they can generate proof certificates), can handle a wide variety of operators (e.g., transcendentals), and handle all normal programmatic constructs (e.g., conditionals and loops). Unfortunately, none of today's approaches can achieve this combination. This position paper summarizes recent progress achieved in the community on this topic. It then showcases the component techniques present within our own rigorous floating-point precision tuning framework called FPTuner, essentially offering a collection of "grab and go" tools that others can benefit from. Finally, we present FPTuner's limitations and describe how we can exploit contemporaneous research to improve it.
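One building block that rigorous analyses of this kind lean on is validated interval evaluation. The sketch below is illustrative only, using mpmath's interval arithmetic rather than FPTuner's own engine: it computes a guaranteed enclosure of an expression over a whole input range, next to a single unverified double-precision sample:

```python
# Illustrative only: interval arithmetic encloses the true real-valued
# result over an entire input range, one ingredient behind certified
# roundoff analyses (this uses mpmath, not FPTuner itself).
from mpmath import iv
import math

iv.prec = 53                       # precision of the interval endpoints
x = iv.mpf(['0.1', '0.2'])         # input x ranges over [0.1, 0.2]
enclosure = iv.exp(x) - 1          # guaranteed bounds on e^x - 1 there
print(enclosure)                   # roughly [0.10517..., 0.22140...]

print(math.exp(0.15) - 1.0)        # one unverified double-precision sample
```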

     
  3. The approximate degree of a Boolean function f is the least degree of a real polynomial that approximates f pointwise to error at most 1/3. The approximate degree of f is known to be a lower bound on the quantum query complexity of f (Beals et al., FOCS 1998 and J. ACM 2001). We find tight or nearly tight bounds on the approximate degrees and quantum query complexities of several basic functions. Specifically, we show the following.
     - k-Distinctness: For any constant k, the approximate degree and quantum query complexity of the k-distinctness function is Ω(n^{3/4 - 1/(2k)}). This is nearly tight for large k, as Belovs (FOCS 2012) has shown that for any constant k, the approximate degree and quantum query complexity of k-distinctness is O(n^{3/4 - 1/(2^{k+2} - 4)}).
     - Image size testing: The approximate degree and quantum query complexity of testing the size of the image of a function [n] → [n] is Ω̃(n^{1/2}). This proves a conjecture of Ambainis et al. (SODA 2016), and it implies tight lower bounds on the approximate degree and quantum query complexity of the following natural problems.
       - k-Junta testing: A tight Ω̃(k^{1/2}) lower bound for k-junta testing, answering the main open question of Ambainis et al. (SODA 2016).
       - Statistical distance from uniform: A tight Ω̃(n^{1/2}) lower bound for approximating the statistical distance of a distribution from uniform, answering the main question left open by Bravyi et al. (STACS 2010 and IEEE Trans. Inf. Theory 2011).
       - Shannon entropy: A tight Ω̃(n^{1/2}) lower bound for approximating Shannon entropy up to a certain additive constant, answering a question of Li and Wu (2017).
     - Surjectivity: The approximate degree of the surjectivity function is Ω̃(n^{3/4}). The best prior lower bound was Ω(n^{2/3}). Our result matches an upper bound of Õ(n^{3/4}) due to Sherstov (STOC 2018), which we reprove using different techniques. The quantum query complexity of this function is known to be Θ(n) (Beame and Machmouchi, Quantum Inf. Comput. 2012 and Sherstov, FOCS 2015).
     Our upper bound for surjectivity introduces new techniques for approximating Boolean functions by low-degree polynomials. Our lower bounds are proved by significantly refining techniques recently introduced by Bun and Thaler (FOCS 2017).
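For intuition, the definition at the start of this item can be checked by brute force for tiny n: the least degree d at which some multilinear polynomial attains pointwise error at most 1/3 is the optimum of a small linear program over the monomial coefficients. A hedged sketch of ours (exponential in n, only sensible for toy inputs):

```python
# Brute-force approximate degree for tiny Boolean functions: for each
# degree d, minimize the pointwise error of a degree-d multilinear
# polynomial over {0,1}^n via a linear program; stop once error <= 1/3.
from itertools import combinations, product
import numpy as np
from scipy.optimize import linprog

def approx_degree(f, n, eps=1/3):
    points = list(product([0, 1], repeat=n))
    fx = np.array([f(x) for x in points], dtype=float)
    for d in range(n + 1):
        monos = [S for k in range(d + 1) for S in combinations(range(n), k)]
        # A[x][S] = value of the monomial prod_{i in S} x_i at point x
        A = np.array([[float(np.prod([x[i] for i in S])) for S in monos]
                      for x in points])
        m = len(monos)
        obj = np.zeros(m + 1); obj[-1] = 1.0          # minimize the error t
        ones = np.ones((len(points), 1))
        A_ub = np.vstack([np.hstack([A, -ones]),      # p(x) - f(x) <= t
                          np.hstack([-A, -ones])])    # f(x) - p(x) <= t
        b_ub = np.concatenate([fx, -fx])
        res = linprog(obj, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(None, None)] * m + [(0, None)])
        if res.status == 0 and res.fun <= eps + 1e-9:
            return d
    return n

AND3 = lambda x: float(all(x))
print(approx_degree(AND3, 3))   # AND_n has approximate degree Theta(sqrt(n))
```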
  4. While standard statistical inference techniques and machine learning generalization bounds assume that tests are run on data selected independently of the hypotheses, practical data analysis and machine learning are usually iterative and adaptive processes in which the same holdout data is used to test a sequence of hypotheses (or models), each of which may depend on the outcome of the previous tests on the same data. In this work, we present RADABOUND, a rigorous, efficient, and practical procedure for controlling the generalization error when using a holdout sample for multiple adaptive testing. Our solution is based on a new application of Rademacher complexity generalization bounds, adapted to dependent tests. We demonstrate the statistical power and practicality of our method through extensive simulations and comparisons to alternative approaches. In particular, we show that our rigorous solution is substantially more powerful and efficient than the differential-privacy-based approach proposed in Dwork et al. [1]-[3].
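As a rough illustration of the central quantity the method builds on (not the RADABOUND procedure itself), the empirical Rademacher complexity of a finite set of test statistics on an n-point holdout can be estimated by Monte Carlo over random sign vectors; everything below is a synthetic toy:

```python
# Monte Carlo estimate of the empirical Rademacher complexity of k test
# statistics on an n-point holdout: E_sigma[ max_j (1/n) sum_i sigma_i v_ji ].
# Larger values mean looser generalization guarantees under adaptive reuse.
import numpy as np

def empirical_rademacher(values, n_trials=2000, seed=0):
    rng = np.random.default_rng(seed)
    k, n = values.shape
    total = 0.0
    for _ in range(n_trials):
        sigma = rng.choice([-1.0, 1.0], size=n)   # random Rademacher signs
        total += np.max(values @ sigma) / n
    return total / n_trials

# Toy holdout: 5 test statistics evaluated on 100 holdout points.
vals = np.random.default_rng(1).normal(size=(5, 100))
print(empirical_rademacher(vals))
```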