Model-finding tools like the Alloy Analyzer produce concrete examples of how a declarative specification can be satisfied. These formal tools are useful in a wide range of domains: software design, security, networking, and more. By producing concrete examples, they assist in exploring system behavior and can help find surprising faults. Specifications usually have many candidate solutions, but model-finders tend to leave the choice of which examples to present entirely to the underlying solver. This paper closes that gap by exploring notions of coverage for the model-finding domain, yielding a novel, rigorous metric for output quality. These ideas are realized in the tool CompoSAT, which interposes itself between Alloy’s constraint-solving and presentation stages to produce ensembles of examples that maximize coverage. We show that high-coverage ensembles like those CompoSAT produces are useful for, among other things, detecting overconstraint—a particularly insidious form of specification error. We detail the underlying theory and implementation of CompoSAT and evaluate it on numerous specifications.
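To give a feel for the idea of a coverage-maximizing ensemble, here is a minimal Python sketch. It is illustrative only: CompoSAT's actual metric is defined over provenance information from the Alloy/Kodkod solving pipeline, and the names here (`pick_ensemble`, the per-example target sets) are hypothetical, not CompoSAT's API.

```python
# Illustrative sketch: assume each candidate example has already been mapped to
# the set of coverage targets it exercises, and greedily pick a small ensemble
# that covers as many targets as possible.

def pick_ensemble(candidates, max_size):
    """candidates: dict mapping example id -> set of coverage targets it hits."""
    remaining = set().union(*candidates.values())
    ensemble = []
    while remaining and len(ensemble) < max_size:
        # Choose the example covering the most still-uncovered targets.
        best = max(candidates, key=lambda ex: len(candidates[ex] & remaining))
        gained = candidates[best] & remaining
        if not gained:
            break  # no remaining example adds new coverage
        ensemble.append(best)
        remaining -= gained
    return ensemble

# Tiny usage example with made-up coverage data.
examples = {
    "ex1": {"t1", "t2"},
    "ex2": {"t2", "t3", "t4"},
    "ex3": {"t4"},
}
print(pick_ensemble(examples, max_size=2))  # e.g. ['ex2', 'ex1']
```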
The power of "why" and "why not": enriching scenario exploration with provenance
Scenario-finding tools like the Alloy Analyzer are widely used in numerous concrete domains like security, network analysis, UML analysis, and so on. They can help to verify properties and, more generally, aid in exploring a system’s behavior. While scenario finders are valuable for their ability to produce concrete examples, individual scenarios only give insight into what is possible, leaving the user to make their own conclusions about what might be necessary. This paper enriches scenario finding by allowing users to ask "why?" and "why not?" questions about the examples they are given. We show how to distinguish parts of an example that cannot be consistently removed (or changed) from those that merely reflect underconstraint in the specification. In the former case we show how to determine which elements of the specification and which other components of the example together explain the presence of such facts. This paper formalizes the act of computing provenance in scenario-finding. We present Amalgam, an extension of the popular Alloy scenario-finder, which implements these foundations and provides interactive exploration of examples. We also evaluate Amalgam’s algorithmics on a variety of both textbook and real-world examples.
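As a rough intuition for the "cannot be consistently removed (or changed)" test, the sketch below is a toy propositional analogue using the Z3 solver. It is not Amalgam's algorithm, which works over Alloy's relational logic and also computes which specification constraints are responsible; the specification and example below are invented for illustration.

```python
# Toy analogue: given a "specification" and a concrete example (a satisfying
# assignment), call a fact "necessary" if flipping it makes the rest of the
# example inconsistent with the spec; otherwise its presence merely reflects
# underconstraint.
from z3 import Bool, And, Or, Not, Implies, Solver, sat

a, b, c = Bool('a'), Bool('b'), Bool('c')
spec = And(Implies(a, b), Or(a, c))           # stand-in specification
example = {a: True, b: True, c: False}        # one scenario a solver might produce

def is_necessary(literal, example, spec):
    """True if 'literal' cannot be flipped while keeping the other facts and spec."""
    s = Solver()
    s.add(spec)
    for v, val in example.items():
        if v is literal:
            s.add(Not(v) if val else v)       # flip the fact under scrutiny
        else:
            s.add(v if val else Not(v))       # keep every other fact fixed
    return s.check() != sat

for v in (a, b, c):
    print(v, "necessary" if is_necessary(v, example, spec) else "underconstrained")
```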
- Award ID(s):
- 1714431
- PAR ID:
- 10055775
- Date Published:
- Journal Name:
- Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering
- Page Range / eLocation ID:
- 106 to 116
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Inclusive design appears rarely, if at all, in most undergraduate computer science (CS) curricula. As a result, many CS students graduate without knowing how to apply inclusive design to the software they build, and go on to careers that perpetuate the proliferation of software that excludes communities of users. Our panel of CS faculty will explain how we have been working to address this problem. For the past several years, we have been integrating bits of inclusive design in multiple courses in CS undergraduate programs, which has had very positive impacts on students' ratings of their instructors, students' ratings of the education climate, and students' retention. The panel's content will be mostly concrete examples of how we are doing this so that attendees can leave with an in-the-trenches understanding of what this looks like for CS faculty across specialization areas and classes.
-
Many data analysis and design problems involve reasoning about points in high-dimensional space. A common strategy is to embed points from this high-dimensional space into a low-dimensional one. As we will show in this paper, a critical property of good embeddings is that they preserve isometry — i.e., they preserve the geodesic distance between points on the original data manifold in their embedded locations in the latent space. However, enforcing isometry is non-trivial for common neural embedding models, such as autoencoders and generative models. Moreover, while theoretically appealing, it is not clear to what extent enforcing isometry is really necessary for a given design or analysis task. This paper answers these questions by constructing an isometric embedding via an isometric autoencoder, which we employ to analyze an inverse airfoil design problem. Specifically, the paper describes how to train an isometric autoencoder and demonstrates its usefulness compared to non-isometric autoencoders on both simple pedagogical examples and airfoil embeddings using the UIUC airfoil dataset. Our ablation study illustrates that enforcing isometry is necessary to accurately discover latent-space clusters — a common analysis researchers perform on low-dimensional embeddings. We also show how isometric autoencoders can uncover pathologies in typical gradient-based shape optimization solvers through an analysis on the SU2-optimized airfoil dataset, wherein we find an over-reliance of the gradient solver on angle of attack. Overall, this paper motivates the use of isometry constraints in neural embedding models, particularly in cases where researchers or designers intend to use distance-based analysis measures (such as clustering, k-Nearest Neighbors methods, etc.) to analyze designs within the latent space. While this work focuses on airfoil design as an illustrative example, it applies to any domain where analyzing isometric design or data embeddings would be useful.
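For readers curious how an isometry constraint can be imposed in practice, the sketch below adds a Jacobian-based penalty to an ordinary PyTorch autoencoder. It is a simplified stand-in for the paper's formulation; the layer sizes, the random-direction trick, and the 0.1 weight are arbitrary choices for illustration.

```python
# Minimal sketch: for the decoder g, an isometric map satisfies ||J_g(z) v|| = ||v||
# for unit vectors v, so we sample random directions and penalize deviation of the
# Jacobian-vector product's norm from 1.
import torch
import torch.nn as nn

latent_dim, data_dim = 2, 64
encoder = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))

def isometry_penalty(z):
    v = torch.randn_like(z)
    v = v / v.norm(dim=-1, keepdim=True)               # random unit directions
    _, jv = torch.autograd.functional.jvp(decoder, z, v, create_graph=True)
    return ((jv.norm(dim=-1) - 1.0) ** 2).mean()       # want ||J v|| close to ||v|| = 1

x = torch.randn(16, data_dim)                           # placeholder batch
z = encoder(x)
recon_loss = ((decoder(z) - x) ** 2).mean()
# Detaching z means this particular penalty only shapes the decoder; 0.1 is arbitrary.
loss = recon_loss + 0.1 * isometry_penalty(z.detach())
loss.backward()
```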
-
Abstract: Topological data analysis (TDA) is a tool from data science and mathematics that is beginning to make waves in environmental science. In this work, we seek to provide an intuitive and understandable introduction to a tool from TDA that is particularly useful for the analysis of imagery, namely, persistent homology. We briefly discuss the theoretical background but focus primarily on understanding the output of this tool and discussing what information it can glean. To this end, we frame our discussion around a guiding example of classifying satellite images from the sugar, fish, flower, and gravel dataset produced for the study of mesoscale organization of clouds by Rasp et al. We demonstrate how persistent homology and its vectorization, persistence landscapes, can be used in a workflow with a simple machine learning algorithm to obtain good results, and we explore in detail how we can explain this behavior in terms of image-level features. One of the core strengths of persistent homology is how interpretable it can be, so throughout this paper we discuss not just the patterns we find but why those results are to be expected given what we know about the theory of persistent homology. Our goal is that readers of this paper will leave with a better understanding of TDA and persistent homology, will be able to identify problems and datasets of their own for which persistent homology could be helpful, and will gain an understanding of the results they obtain from applying the included GitHub example code. Significance Statement: Information such as the geometric structure and texture of image data can greatly support the inference of the physical state of an observed Earth system, for example, in remote sensing to determine whether wildfires are active or to identify local climate zones. Persistent homology is a branch of topological data analysis that allows one to extract such information in an interpretable way—unlike black-box methods like deep neural networks. The purpose of this paper is to explain in an intuitive manner what persistent homology is and how researchers in environmental science can use it to create interpretable models. We demonstrate the approach to identify certain cloud patterns from satellite imagery and find that the resulting model is indeed interpretable.
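As a rough illustration of the kind of pipeline the abstract describes (persistence diagrams, a vectorization, then a simple classifier), here is a simplified Python sketch. It departs from the paper in several ways: it treats bright pixels as a 2-D point cloud and uses Vietoris-Rips persistence via the ripser package, and it replaces persistence landscapes with crude per-diagram statistics; the threshold, placeholder data, and labels are all made up.

```python
import numpy as np
from ripser import ripser
from sklearn.linear_model import LogisticRegression

def diagram_features(image, threshold=0.5, max_points=400):
    pts = np.argwhere(image > threshold).astype(float)   # bright pixels as points
    if len(pts) > max_points:                             # subsample for speed
        pts = pts[np.random.choice(len(pts), max_points, replace=False)]
    dgms = ripser(pts, maxdim=1)['dgms']                  # H0 and H1 diagrams
    feats = []
    for dgm in dgms:
        finite = dgm[np.isfinite(dgm[:, 1])]
        lifetimes = finite[:, 1] - finite[:, 0]
        # Crude stand-in for a landscape: count, total, and max persistence.
        feats += [len(finite), lifetimes.sum(), lifetimes.max(initial=0.0)]
    return np.array(feats)

# images: array of shape (n, H, W) in [0, 1]; labels: cloud-pattern classes.
images = np.random.rand(20, 32, 32)                       # placeholder data
labels = np.random.randint(0, 2, size=20)                 # placeholder labels
X = np.stack([diagram_features(img) for img in images])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
```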
-
Introduction: Learning standards are a crucial determinant of computer science (CS) education at the K-12 level, but they are not often researched despite their importance. We sought to address this gap with a mixed-methods study examining state and national K-12 CS standards. Research Question: What are the similarities and differences between state and national computer science standards? Methods: We tagged the state CS standards (n = 9695) according to their grade band/level, topic, course, and similarity to a Computer Science Teachers Association (CSTA) standard. We also analyzed the content of standards similar to CSTA standards to determine their topics, cognitive complexity, and other features. Results: We found some commonalities amidst broader diversity in approaches to organization and content across the states, relative to the CSTA standards. The content analysis showed that a common difference between state and CSTA standards is that the state standards tend to include concrete examples. We also found differences across states in how similar their standards are to CSTA standards, as well as differences in how cognitively complex the standards are. Discussion: Standards writers face many tensions and trade-offs, and this analysis shows how – in general terms – various states have chosen to manage those trade-offs in writing standards. For example, adding examples can improve clarity and specificity, but perhaps at the cost of brevity and longevity. A better understanding of the landscape of state standards can assist future standards writers, curriculum developers, and researchers in their work.
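The similarity tagging in this study was done by researchers reading the standards, but for readers curious how candidate state-to-CSTA matches might be surfaced automatically, here is a hedged sketch using TF-IDF cosine similarity from scikit-learn. The example standard texts and the 0.3 cutoff are invented for illustration and are not from the study.

```python
# Illustrative only: the study's similarity judgments were made by human coders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

csta = [
    "Create programs that use variables to store and modify data.",
    "Decompose problems into smaller, manageable subproblems.",
]
state = [
    "Students write programs that store and change data using variables.",
    "Students explain how the internet transmits data between devices.",
]

vec = TfidfVectorizer(stop_words="english")
matrix = vec.fit_transform(csta + state)
# Similarity of each state standard to each CSTA standard.
sims = cosine_similarity(matrix[len(csta):], matrix[:len(csta)])

for i, row in enumerate(sims):
    j = row.argmax()
    if row[j] > 0.3:                      # arbitrary cutoff for "similar"
        print(f"state[{i}] best matches csta[{j}] (score {row[j]:.2f})")
    else:
        print(f"state[{i}] has no close CSTA match")
```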

