Title: The power of "why" and "why not": enriching scenario exploration with provenance
Scenario-finding tools like the Alloy Analyzer are widely used in numerous concrete domains such as security, network analysis, and UML analysis. They can help verify properties and, more generally, aid in exploring a system's behavior. While scenario finders are valuable for their ability to produce concrete examples, individual scenarios only give insight into what is possible, leaving the user to draw their own conclusions about what might be necessary. This paper enriches scenario finding by allowing users to ask "why?" and "why not?" questions about the examples they are given. We show how to distinguish parts of an example that cannot be consistently removed (or changed) from those that merely reflect underconstraint in the specification. In the former case, we show how to determine which elements of the specification and which other components of the example together explain the presence of such facts. This paper formalizes the act of computing provenance in scenario finding. We present Amalgam, an extension of the popular Alloy scenario finder, which implements these foundations and provides interactive exploration of examples. We also evaluate Amalgam's algorithmics on a variety of both textbook and real-world examples.
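The "necessary vs. underconstrained" distinction the abstract describes can be illustrated in miniature. The sketch below is not Amalgam's algorithm (which works over Alloy specifications and bounded relational models); it is a hypothetical brute-force toy in which a specification is a predicate over a finite set of boolean facts, and a fact in an example is "necessary" exactly when no satisfying model flips it. The names `classify_facts` and `spec` are illustrative, not from the paper.

```python
from itertools import chain, combinations

def all_subsets(universe):
    """Enumerate every subset of the candidate facts (exponential; toy sizes only)."""
    return chain.from_iterable(combinations(universe, r) for r in range(len(universe) + 1))

def classify_facts(spec, universe, example):
    """For each candidate fact, report whether flipping it relative to the
    given example (removing it if present, adding it if absent) admits any
    satisfying model. If not, the fact is 'necessary'; otherwise its state
    merely reflects underconstraint."""
    assert spec(example), "the example must satisfy the specification"
    models = [frozenset(s) for s in all_subsets(universe) if spec(frozenset(s))]
    report = {}
    for fact in universe:
        flipped_has_model = any((fact in m) != (fact in example) for m in models)
        report[fact] = "underconstrained" if flipped_has_model else "necessary"
    return report

# Toy spec: "a" must hold; "b" is optional; "c" requires "b".
def spec(facts):
    return "a" in facts and ("c" not in facts or "b" in facts)

report = classify_facts(spec, {"a", "b", "c"}, frozenset({"a", "b"}))
print(report["a"])  # necessary: no model omits "a"
print(report["b"])  # underconstrained: {"a"} alone also satisfies the spec
```

Asking "why is a here?" then amounts to pointing at the clause of `spec` that forces it; real scenario finders answer this without enumerating all models.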
Award ID(s):
1714431
PAR ID:
10055775
Date Published:
Journal Name:
Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering
Page Range / eLocation ID:
106 to 116
Sponsoring Org:
National Science Foundation
More Like this
  1. Model-finding tools like the Alloy Analyzer produce concrete examples of how a declarative specification can be satisfied. These formal tools are useful in a wide range of domains: software design, security, networking, and more. By producing concrete examples, they assist in exploring system behavior and can help find surprising faults. Specifications usually have many potential candidate solutions, but model-finders tend to leave the choice of which examples to present entirely to the underlying solver. This paper closes that gap by exploring notions of coverage for the model-finding domain, yielding a novel, rigorous metric for output quality. These ideas are realized in the tool CompoSAT, which interposes itself between Alloy's constraint-solving and presentation stages to produce ensembles of examples that maximize coverage. We show that high-coverage ensembles like those CompoSAT produces are useful for, among other things, detecting overconstraint—a particularly insidious form of specification error. We detail the underlying theory and implementation of CompoSAT and evaluate it on numerous specifications.
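Producing an ensemble that maximizes coverage is, at its core, a set-cover problem. The sketch below is not CompoSAT's actual metric (which is grounded in provenance); it is a generic greedy set-cover pass under the assumption that each candidate example is annotated with the set of coverage targets it exercises. The names `greedy_ensemble` and the example data are hypothetical.

```python
def greedy_ensemble(coverage):
    """coverage maps example name -> set of coverage targets it exercises.
    Greedy set cover: repeatedly pick the example that adds the most
    still-uncovered targets, until everything reachable is covered."""
    uncovered = set().union(*coverage.values())
    ensemble = []
    while uncovered:
        name, covered = max(coverage.items(),
                            key=lambda kv: len(kv[1] & uncovered))
        if not covered & uncovered:
            break  # remaining targets are unreachable by any candidate
        ensemble.append(name)
        uncovered -= covered
    return ensemble

candidates = {
    "ex1": {"t1", "t2"},
    "ex2": {"t2", "t3", "t4"},
    "ex3": {"t4"},
}
print(greedy_ensemble(candidates))  # ['ex2', 'ex1'] — ex3 adds nothing new
```

Greedy selection gives the classic logarithmic approximation to minimum set cover, which is why a small ensemble can still hit every coverage target.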
  2. Objective: This paper demonstrates computational modeling’s value as a tool for mapping the impact of hidden variables and evaluating the accuracy of statistical methods in bioarchaeology. Materials: As a working example, this paper presents an agent-based model of a 1,000-person cohort of individuals who can form an unspecified skeletal lesion at any age between birth and ten years and enter a simulated cemetery at the end of their lives. Skeletal lesions either have no effect on mortality risk (scenario 1) or are associated with doubled mortality risk (scenario 2). Methods: The agent-based model simulates data on individual age at death and lesion status. Kaplan-Meier survival analysis is run on each simulated dataset, comparing survival estimates for individuals with and without lesions. Results: Survival analyses underestimate the true value of lesion-associated mortality risk in early life in scenario 2 and produce a false lesion-associated survival advantage under the null conditions of scenario 1. Conclusions: Researchers should account for the ages of a skeletal lesion’s developmental window, where known, when assessing lesion-associated mortality. Survival analyses return accurate results when they exclude individuals in the ages of active lesion formation. Significance: Modeling experiments can identify which archaeologically unmeasurable variables have the greatest impact on estimates of population health and outline the ways in which they bias estimates of past health from the skeletal record. Limitations: The only limits on modeling are limits of imagination and common sense. Suggestions for future research: Many other archaeologically hidden variables remain to be explored with this approach.
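The experimental design described above (simulate a cohort, then run Kaplan-Meier on the simulated deaths) can be reproduced in miniature. The sketch below is a hypothetical simplification, not the paper's model: lesions form at birth rather than during a developmental window, the annual death hazard is constant, and there is no censoring. Function names and parameter values are illustrative.

```python
import random

def simulate_cohort(n, lesion_rate, hazard, lesion_multiplier, rng):
    """Agent-based toy: each agent may carry a lesion from birth; the
    per-year death hazard is multiplied (e.g. doubled, as in scenario 2)
    for lesion-bearers. Returns (age_at_death, has_lesion) pairs."""
    cohort = []
    for _ in range(n):
        lesion = rng.random() < lesion_rate
        h = hazard * (lesion_multiplier if lesion else 1)
        age = 0
        while rng.random() > h:  # survive another year with prob 1 - h
            age += 1
        cohort.append((age, lesion))
    return cohort

def km_survival(ages):
    """Kaplan-Meier survival estimate for fully observed deaths
    (no censoring): S(t) = prod over death times <= t of (1 - d/n_at_risk)."""
    surv, s = {}, 1.0
    for t in sorted(set(ages)):
        deaths = ages.count(t)
        at_risk = sum(1 for a in ages if a >= t)
        s *= 1 - deaths / at_risk
        surv[t] = s
    return surv

rng = random.Random(0)
cohort = simulate_cohort(1000, 0.3, 0.05, 2, rng)
with_lesion = [a for a, l in cohort if l]
without = [a for a, l in cohort if not l]
print(sum(with_lesion) / len(with_lesion))  # mean age at death, lesion group
print(sum(without) / len(without))          # mean age at death, no-lesion group
```

With doubled hazard the lesion group dies markedly younger on average, and comparing `km_survival(with_lesion)` against `km_survival(without)` shows how the estimated survival curves separate.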
  3. Inclusive design appears rarely, if at all, in most undergraduate computer science (CS) curricula. As a result, many CS students graduate without knowing how to apply inclusive design to the software they build, and go on to careers that perpetuate the proliferation of software that excludes communities of users. Our panel of CS faculty will explain how we have been working to address this problem. For the past several years, we have been integrating bits of inclusive design in multiple courses in CS undergraduate programs, which has had very positive impacts on students' ratings of their instructors, students' ratings of the education climate, and students' retention. The panel's content will be mostly concrete examples of how we are doing this so that attendees can leave with an in-the-trenches understanding of what this looks like for CS faculty across specialization areas and classes. We 
  4. Many data analysis and design problems involve reasoning about points in high-dimensional space. A common strategy is to embed points from this high-dimensional space into a low-dimensional one. As we will show in this paper, a critical property of good embeddings is that they preserve isometry, i.e., preserve the geodesic distance between points on the original data manifold within their embedded locations in the latent space. However, enforcing isometry is non-trivial for common neural embedding models, such as autoencoders and generative models. Moreover, while theoretically appealing, it is not clear to what extent enforcing isometry is really necessary for a given design or analysis task. This paper answers these questions by constructing an isometric embedding via an isometric autoencoder, which we employ to analyze an inverse airfoil design problem. Specifically, the paper describes how to train an isometric autoencoder and demonstrates its usefulness compared to non-isometric autoencoders on both simple pedagogical examples and for airfoil embeddings using the UIUC airfoil dataset. Our ablation study illustrates that enforcing isometry is necessary to accurately discover latent space clusters, a common analysis method researchers typically perform on low-dimensional embeddings. We also show how isometric autoencoders can uncover pathologies in typical gradient-based shape optimization solvers through an analysis on the SU2-optimized airfoil dataset, wherein we find an over-reliance of the gradient solver on angle of attack. Overall, this paper motivates the use of isometry constraints in neural embedding models, particularly in cases where researchers or designers intend to use distance-based analysis measures (such as clustering, k-Nearest Neighbors methods, etc.) to analyze designs within the latent space. While this work focuses on airfoil design as an illustrative example, it applies to any domain where analyzing isometric design or data embeddings would be useful.
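The isometry property the abstract centers on can be stated concretely as a loss term. The paper targets geodesic distances on the data manifold and enforces the constraint during autoencoder training; the pure-Python sketch below is a much weaker illustration that only compares Euclidean pairwise distances before and after a map. The function names are hypothetical.

```python
import math

def pairwise_distances(points):
    """Euclidean distance between every pair of points."""
    d = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d[(i, j)] = math.dist(points[i], points[j])
    return d

def isometry_loss(inputs, embeddings):
    """Mean squared mismatch between input-space and latent-space pairwise
    distances; zero exactly when the map preserves all pairwise distances.
    A term like this can be added to an autoencoder's reconstruction loss."""
    din, dz = pairwise_distances(inputs), pairwise_distances(embeddings)
    return sum((din[k] - dz[k]) ** 2 for k in din) / len(din)

# A rotation is an isometry; squashing one axis is not.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
rotated = [(-y, x) for x, y in pts]
squashed = [(x, 0.5 * y) for x, y in pts]
print(isometry_loss(pts, rotated))   # 0.0
print(isometry_loss(pts, squashed))  # positive: distances are distorted
```

Distance-distorting embeddings are exactly what break clustering and k-nearest-neighbor analyses in latent space, which is the failure mode the paper's ablation study demonstrates.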
  5. Abstract: Topological data analysis (TDA) is a tool from data science and mathematics that is beginning to make waves in environmental science. In this work, we seek to provide an intuitive and understandable introduction to a tool from TDA that is particularly useful for the analysis of imagery, namely, persistent homology. We briefly discuss the theoretical background but focus primarily on understanding the output of this tool and discussing what information it can glean. To this end, we frame our discussion around a guiding example of classifying satellite images from the sugar, fish, flower, and gravel dataset produced for the study of mesoscale organization of clouds by Rasp et al. We demonstrate how persistent homology and its vectorization, persistence landscapes, can be used in a workflow with a simple machine learning algorithm to obtain good results, and we explore in detail how we can explain this behavior in terms of image-level features. One of the core strengths of persistent homology is how interpretable it can be, so throughout this paper we discuss not just the patterns we find but why those results are to be expected given what we know about the theory of persistent homology. Our goal is that readers of this paper will leave with a better understanding of TDA and persistent homology, will be able to identify problems and datasets of their own for which persistent homology could be helpful, and will gain an understanding of the results they obtain from applying the included GitHub example code. Significance Statement: Information such as the geometric structure and texture of image data can greatly support the inference of the physical state of an observed Earth system, for example, in remote sensing to determine whether wildfires are active or to identify local climate zones.
Persistent homology is a branch of topological data analysis that allows one to extract such information in an interpretable way—unlike black-box methods like deep neural networks. The purpose of this paper is to explain in an intuitive manner what persistent homology is and how researchers in environmental science can use it to create interpretable models. We demonstrate the approach to identify certain cloud patterns from satellite imagery and find that the resulting model is indeed interpretable. 
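The simplest instance of the persistent homology pipeline described above is 0-dimensional persistence of a 1D signal, where components of sublevel sets are tracked with a union-find structure and the elder rule (when two components merge, the younger one dies). The paper works on 2D satellite imagery; this 1D sketch, with hypothetical function names, only illustrates the birth-death bookkeeping.

```python
def persistence_0d(values):
    """(birth, death) pairs for connected components of the sublevel-set
    filtration of a 1D signal. Components are born at local minima; when
    two merge, the elder rule kills the one with the later birth. The
    global-minimum component persists forever (death = inf)."""
    n = len(values)
    parent = list(range(n))
    birth = [float("inf")] * n
    active = [False] * n

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=values.__getitem__):
        active[i] = True
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and active[j]:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # Elder rule: the component born later dies at this merge.
                old, young = (a, b) if birth[a] <= birth[b] else (b, a)
                if birth[young] < values[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    root = find(0)
    pairs.append((birth[root], float("inf")))
    return sorted(pairs)

# Three local minima (0.5, 0.8, 1.0) give three components; two die at merges.
print(persistence_0d([1.0, 3.0, 0.5, 2.0, 0.8]))
# [(0.5, inf), (0.8, 2.0), (1.0, 3.0)]
```

Long-lived pairs (large death minus birth) are the interpretable topological features that, vectorized as persistence landscapes, feed the machine learning workflow the abstract describes.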