skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Etna: An Evaluation Platform for Property-Based Testing (Experience Report)
Property-based testing is a mainstay of functional programming, boasting a rich literature, an enthusiastic user community, and an abundance of tools — so many, indeed, that new users may have difficulty choosing. Moreover, any given framework may support a variety of strategies for generating test inputs; even experienced users may wonder which are better in a given situation. Sadly, the PBT literature, though long on creativity, is short on rigorous comparisons to help answer such questions. We present Etna, a platform for empirical evaluation and comparison of PBT techniques. Etna incorporates a number of popular PBT frameworks and testing workloads from the literature, and its extensible architecture makes adding new ones easy, while handling the technical drudgery of performance measurement. To illustrate its benefits, we use Etna to carry out several experiments with popular PBT approaches in both Coq and Haskell, allowing users to more clearly understand best practices and tradeoffs.  more » « less
Award ID(s):
2145649 1955610
PAR ID:
10601821
Author(s) / Creator(s):
 ;  ;  ;  ;  
Publisher / Repository:
Association for Computing Machinery (ACM)
Date Published:
Journal Name:
Proceedings of the ACM on Programming Languages
Volume:
7
Issue:
ICFP
ISSN:
2475-1421
Format(s):
Medium: X Size: p. 878-894
Size(s):
p. 878-894
Sponsoring Org:
National Science Foundation
More Like this
  1. Gibbons, Jeremy (Ed.)
    CONTEXT The success of QuickCheck has led to the development of property-based testing (PBT) libraries for many languages and the process is getting increasing attention. However, unlike regular testing, PBT is not widespread in collegiate curricula. Furthermore, the value of PBT is not limited to software testing. The growing use of formal methods in, and the growth of software synthesis, all create demand for techniques to train students and developers in the art of specification writing. We posit that PBT forms a strong bridge between testing and the act of specification: it’s a form of testing where the tester is actually writing abstract specifications. INQUIRY Even well-informed technologists mention the difficulty of finding good motivating examples for its use. We take steps to fill this lacuna. APPROACH & KNOWLEDGE We find that the use of “relational” problems—those for which an input may admit multiple valid outputs—easily motivates the use of PBT. We also notice that such problems are readily available in the computer science pantheon of problems (e.g., many graph and sorting algorithms). We have been using these for some years now to teach PBT in collegiate courses. GROUNDING In this paper, we describe the problems we use and report on students’ completion of them. We believe the problems overcome some of the motivation issues described above. We also show that students can do quite well at PBT for these problems, suggesting that the topic is well within their reach. In the process, we introduce a simple method to evaluate the accuracy of their specifications, and use it to characterize their common mistakes. IMPORTANCE Based on our findings, we believe that relational problems are an underutilized motivating example for PBT. We hope this paper initiates a catalog of such problems for educators (and developers) to use, and also provides a concrete (though by no means exclusive) method to analyze the quality of PBT. 
    more » « less
  2. Software developers increasingly rely on automated methods to assess the correctness of their code. One such method is property-based testing (PBT), wherein a test harness generates hundreds or thousands of inputs and checks the outputs of the program on those inputs using parametric properties. Though powerful, PBT induces a sizable gulf of evaluation: developers need to put in nontrivial effort to understand how well the different test inputs exercise the software under test. To bridge this gulf, we propose Tyche, a user interface that supports sensemaking around the effectiveness of property-based tests. Guided by a formative design exploration, our design of Tyche supports developers with interactive, configurable views of test behavior with tight integrations into modern developer testing workflow. These views help developers explore global testing behavior and individual test inputs alike. To accelerate the development of powerful, interactive PBT tools, we define a standard for PBT test reporting and integrate it with a widely used PBT library. A self-guided online usability study revealed that Tyche’s visualizations help developers to more accurately assess software testing effectiveness. 
    more » « less
  3. Abstract We introduce PhyloJunction, a computational framework designed to facilitate the prototyping, testing, and characterization of evolutionary models. PhyloJunction is distributed as an open-source Python library that can be used to implement a variety of models, thanks to its flexible graphical modeling architecture and dedicated model specification language. Model design and use are exposed to users via command-line and graphical interfaces, which integrate the steps of simulating, summarizing, and visualizing data. This article describes the features of PhyloJunction—which include, but are not limited to, a general implementation of a popular family of phylogenetic diversification models—and, moving forward, how it may be expanded to not only include new models, but to also become a platform for conducting and teaching statistical learning. 
    more » « less
  4. Although persistent, bioaccumulative and toxic pollutants (PBTs) are well-studied individually, their distribution and variability on a global scale are largely unknown, particularly in marine fish. Using 2,662 measurements collected from peer-reviewed literature spanning 1969–2012, we examined variability of five classes of PBTs, considering effects of geography, habitat, and trophic level on observed concentrations. While we see large-scale spatial patterning in some PBTs (chlordanes, polychlorinated biphenyls), habitat type and trophic level did not contribute to significant patterning, with the exception of mercury. We further examined patterns of change in PBT concentration as a function of sampling year. All PBTs showed significant declines in concentration levels through time, ranging from 15–30% reduction per decade across PBT groups. Despite consistent evidence of reductions, variation in pollutant concentration remains high, indicating ongoing consumer risk of exposure to fish with pollutant levels exceeding EPA screening values. The temporal trends indicate that mitigation programs are effective, but that global levels decline slowly. In order for monitoring efforts to provide more targeted assessments of risk to PBT exposure, these data highlight an urgent need for improved replication and standardization of pollutant monitoring protocols for marine finfish. 
    more » « less
  5. null (Ed.)
    Determinantal point processes (DPPs) are popular probabilistic models of diversity. In this paper, we investigate DPPs from a new perspective: property testing of distributions. Given sample access to an unknown distribution q over the subsets of a ground set, we aim to distinguish whether q is a DPP distribution or ϵ-far from all DPP distributions in ℓ1-distance. In this work, we propose the first algorithm for testing DPPs. Furthermore, we establish a matching lower bound on the sample complexity of DPP testing. This lower bound also extends to showing a new hardness result for the problem of testing the more general class of log-submodular distributions 
    more » « less