skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Multi-Level Analysis of Compiler-Induced Variability and Performance Tradeoffs
Successful HPC software applications are long-lived. When ported across machines and their compilers, these applications often produce different numerical results, many of which are unacceptable. Such variability is also a concern while optimizing the code more aggressively to gain performance. Efficient tools that help locate the program units (files and functions) within which most of the variability occurs are badly needed, both to plan for code ports and to root-cause errors due to variability when they happen in the field. In this work, we offer an enhanced version of the open-source testing framework FLiT to serve these roles. Key new features of FLiT include a suite of bisection algorithms that help locate the root causes of variability. Another added feature allows an analysis of the tradeoffs between performance and the degree of variability. Our new contributions also include a collection of case studies. Results on the MFEM finite-element library include variability/performance tradeoffs, and the identification of a (hitherto unknown) abnormal level of result-variability even under mild compiler optimizations. Results from studying the Laghos proxy application include identifying a significantly divergent floating-point result-variability and successful root-causing down to the problematic function over as little as 14 program executions. Finally, in an evaluation of 4,376 controlled injections of floating-point perturbations on the LULESH proxy application, we showed that the FLiT framework has 100% precision and recall in discovering the file and function locations of the injections all within an average of only 15 program executions.  more » « less
Award ID(s):
1817073 1704715
PAR ID:
10101376
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
HPDC '19 Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing
Page Range / eLocation ID:
61 to 72
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null; null (Ed.)
    New abstractions and frameworks are born when one creates hard-coded solutions to important tasks, regardless of whether they scale or result in software that can be meaningfully released. This paper describes our experience creating such a light-weight framework out of a previous tool effort FLiT for detecting compiler-induced numerical variability. The resulting framework FLOAT has already helped us better understand and fix performance bugs in FLiT. Our design of FLOAT and the ways in which we anticipate it enabling the adoption and re-purposing of FLiT, though likely not exhaustive, are described. We also express our views on the appropriate scope of such an approach, especially given that variations of compilation, linking, and execution abound, and specializing in that domain may be advantageous in the long-term as opposed to investing in an overly generalized paradigm. 
    more » « less
  2. Scientific applications, especially legacy applications, contain a wealth of scientific knowledge. As hardware changes, applications need to be ported to new architectures and extended to include scientific advances. As a result, it is common to encounter problems like performance bottlenecks and dead code. A visual representation of the dataflow can help performance experts identify and debug such problems. The Computation API of the sparse polyhedral framework (SPF) provides a single entry point for tools to generate and manipulate polyhedral dataflow graphs, and transform applications. However, when viewing graphs generated for scientific applications there are several barriers. The graphs are large, and manipulating their layout to respect execution order is difficult. This paper presents a case study that uses the Computation API to represent a scientific application, GeoAc, in the SPF. Generated polyhedral dataflow graphs were explored for optimization opportunities and limitations were addressed using several graph simplifications to improve their usability. 
    more » « less
  3. The classical universal approximation (UA) theorem for neural networks establishes mild conditions under which a feedforward neural network can approximate a continuous functionfwith arbitrary accuracy. A recent result shows that neural networks also enjoy a more generalintervaluniversal approximation (IUA) theorem, in the sense that the abstract interpretation semantics of the network using the interval domain can approximate the direct image map off(i.e., the result of applyingfto a set of inputs) with arbitrary accuracy. These theorems, however, rest on the unrealistic assumption that the neural network computes over infinitely precise real numbers, whereas their software implementations in practice compute over finite-precision floating-point numbers. An open question is whether the IUA theorem still holds in the floating-point setting. This paper introduces the first IUA theorem forfloating-pointneural networks that proves their remarkable ability toperfectly capturethe direct image map of any rounded target functionf, showing no limits exist on their expressiveness. Our IUA theorem in the floating-point setting exhibits material differences from the real-valued setting, which reflects the fundamental distinctions between these two computational models. This theorem also implies surprising corollaries, which include (i) the existence ofprovably robustfloating-point neural networks; and (ii) thecomputational completenessof the class of straight-line programs that use only floating-point additions and multiplications for the class of all floating-point programs that halt. 
    more » « less
  4. While many real-world programs are shipped with configurations to enable/disable functionalities, fuzzers have mostly been applied to test single configurations of these programs. In this work, we first conduct an empirical study to understand how program configurations affect fuzzing performance. We find that limiting a campaign to a single configuration can result in failing to cover a significant amount of code. We also observe that different program configurations contribute differing amounts of code coverage, challenging the idea that each one can be efficiently fuzzed individually. Motivated by these two observations, we propose ConfigFuzz , which can fuzz configurations along with normal inputs. ConfigFuzz transforms the target program to encode its program options within part of the fuzzable input, so existing fuzzers’ mutation operators can be reused to fuzz program configurations. We instantiate ConfigFuzz on six configurable, common fuzzing targets, and integrate their executions in FuzzBench. In our evaluation, ConfigFuzz outperforms two baseline fuzzers in four targets, while the results are mixed in the other targets due to program size and configuration space. We also analyze the options fuzzed by ConfigFuzz and how they affect the performance. 
    more » « less
  5. null (Ed.)
    With growing access to versatile robotics, it is beneficial for end users to be able to teach robots tasks without needing to code a control policy. One possibility is to teach the robot through successful task executions. However, near-optimal demonstrations of a task can be difficult to provide and even successful demonstrations can fail to capture task aspects key to robust skill replication. Here, we propose a learning from demonstration (LfD) approach that enables learning of robust task definitions without the need for near-optimal demonstrations. We present a novel algorithmic framework for learning task specifications based on the ergodic metric—a measure of information content in motion. Moreover, we make use of negative demonstrations— demonstrations of what not to do—and show that they can help compensate for imperfect demonstrations, reduce the number of demonstrations needed, and highlight crucial task elements improving robot performance. In a proof-of-concept example of cart-pole inversion, we show that negative demonstrations alone can be sufficient to successfully learn and recreate a skill. Through a human subject study with 24 participants, we show that consistently more information about a task can be captured from combined positive and negative (posneg) demonstrations than from the same amount of just positive demonstrations. Finally, we demonstrate our learning approach on simulated tasks of target reaching and table cleaning with a 7-DoF Franka arm. Our results point towards a future with robust, data efficient LfD for novice users. 
    more » « less