

Search for: All records

Award ID contains: 1901543


  1. Maintaining confidential information control in software is a persistent security problem where failure means secrets can be revealed via program behaviors. Information flow control techniques have traditionally been based on static or symbolic analyses, which are limited in scalability and specialized to particular languages. When programs do leak secrets, there are no approaches to automatically repair them unless the leak causes a functional test to fail. We present our vision for HyperGI, a genetic improvement framework that detects, localizes, and repairs information leakage. Key elements of HyperGI include (1) the use of two orthogonal test suites, (2) a dynamic leak detection approach which estimates and localizes potential leaks, and (3) a repair component that produces a candidate patch using genetic improvement. We demonstrate the successful use of HyperGI on several programs with no failing functional test cases. We manually examine the resulting patches and identify trade-offs and future directions for fully realizing our vision.
  2. In this paper we argue for using many partial test suites instead of one full test suite during program repair. This may provide a pool of simpler, yet correct patches, addressing both the overfitting and poor repair quality problems. To support this idea, we present some insights obtained by running automated program repair (APR) with partial test suites on the well-studied triangle program.
  3. null (Ed.)
  4. Software product line engineering is a best practice for managing reuse in families of software systems that is increasingly being applied to novel and emerging domains. In this work we investigate the use of software product line engineering in one of these new domains, synthetic biology. In synthetic biology, living organisms are programmed to perform new functions or improve existing functions. These programs are designed and constructed using small building blocks made out of DNA. We conjecture that there are families of products that consist of common and variable DNA parts, and we can leverage product line engineering to help synthetic biologists build, evolve, and reuse DNA parts. In this paper we perform an investigation of domain engineering that leverages an open-source repository of more than 45,000 reusable DNA parts. We show the feasibility of these new types of product line models by identifying features and related artifacts in up to 93.5% of products, and that there is indeed both commonality and variability. We then construct feature models for four commonly engineered functions, leading to product lines ranging from 10 to 7.5 × 10^20 products. In a case study we demonstrate how we can use the feature models to help guide new experimentation in aspects of application engineering. Finally, in an empirical study we demonstrate the effectiveness and efficiency of automated reverse engineering on both complete and incomplete sets of products. In the process of these studies, we highlight key challenges and uncover limitations of existing SPL techniques and tools, which provide a roadmap for making SPL engineering applicable to new and emerging domains.
  5. Lal, Rup (Ed.)
    ABSTRACT Microbial metabolism and trophic interactions between microbes give rise to complex multispecies communities in microbe-host systems. Bacteroides thetaiotaomicron (B. theta) is a human gut symbiont thought to play an important role in maintaining host health. Untargeted nuclear magnetic resonance metabolomics revealed that B. theta secretes specific organic acids and amino acids in defined minimal medium. Physiological concentrations of acetate and formate found in the human intestinal tract were shown to cause dose-dependent changes in secretion of metabolites known to play roles in host nutrition and pathogenesis. While secretion fluxes varied, biomass yield was unchanged, suggesting feedback inhibition does not affect metabolic bioenergetics but instead redirects carbon and energy to CO2 and H2. Flux balance analysis modeling showed increased flux through CO2-producing reactions under glucose-limiting growth conditions. The metabolic dynamics observed for B. theta, a keystone symbiont organism, underscore the need for metabolic modeling to complement genomic predictions of microbial metabolism to infer mechanisms of microbe-microbe and microbe-host interactions. IMPORTANCE Bacteroides is a highly abundant taxon in the human gut, and Bacteroides thetaiotaomicron (B. theta) is a ubiquitous human symbiont that colonizes the host early in development and persists throughout its life span. The phenotypic plasticity of keystone organisms such as B. theta is important to understand in order to predict phenotype(s) and metabolic interactions under changing nutrient conditions such as those that occur in complex gut communities. Our study shows B. theta prioritizes energy conservation and suppresses secretion of “overflow metabolites” such as organic acids and amino acids when concentrations of acetate are high. Secreted metabolites, especially amino acids, can be a source of nutrients or signals for the host or other microbes in the community. Our study suggests that when metabolically stressed by acetate, B. theta stops sharing with its ecological partners.
  6. Aleti, A.; Panichella, A. (Eds.)
    Users of highly-configurable software systems often want to optimize a particular objective such as improving a functional outcome or increasing system performance. One approach is to use an evolutionary algorithm. However, many applications today are data-driven, meaning they depend on inputs or data which can be complex and varied. Hence, a search needs to be run (and re-run) for all inputs, making optimization a heavy-weight and potentially impractical process. In this paper, we explore this issue on a data-driven highly-configurable scientific application. We build an exhaustive database containing 3,000 configurations and 10,000 inputs, leading to almost 100 million records as our oracle, and then run a genetic algorithm individually on each of the 10,000 inputs. We ask (1) whether a genetic algorithm can find configurations to improve functional objectives; (2) whether patterns of best configurations over all input data emerge; and (3) whether we can use sampling to approximate the results. We find that the original (default) configuration is best only 34% of the time, while clear patterns emerge of other best configurations. Out of 3,000 possible configurations, only 112 distinct configurations achieve the optimal result at least once across all 10,000 inputs, suggesting the potential for lighter weight optimization approaches. We show that sampling of the input data finds similar patterns at a lower cost.
  7. Software product line engineering is a best practice for managing reuse in families of software systems. In this work, we explore the use of product line engineering in the emerging programming domain of synthetic biology. In synthetic biology, living organisms are programmed to perform new functions or improve existing functions. These programs are designed and constructed using small building blocks made out of DNA. We conjecture that there are families of products that consist of common and variable DNA parts, and we can leverage product line engineering to help synthetic biologists build, evolve, and reuse these programs. As a first step towards this goal, we perform a domain engineering case study that leverages an open-source repository of more than 45,000 reusable DNA parts. We are able to identify features and their related artifacts, all of which can be composed to make different programs. We demonstrate that we can successfully build feature models representing families for two commonly engineered functions. We then analyze an existing synthetic biology case study and demonstrate how product line engineering can be beneficial in this domain. 
  8. In this paper we revisit the field of search-based software testing (SBST) in the context of its technological maturity. We highlight some successes with respect to tools, hybrid approaches, extensions and industry adoption. We then discuss some open challenges that remain for SBST including the need for new approaches to system testing, automated oracle generation, incorporating humans into the search process, and leveraging learning through hyper-heuristic search. 
  9. Android has rocketed to the top of the mobile market thanks in large part to its open source model. Vendors use Android for their devices for free, and companies make customizations to suit their needs. This has resulted in a myriad of configurations that are extant in the user space today. In this paper, we show that differences in configurations, if ignored, can lead to differences in test outputs and code coverage. Consequently, researchers who develop new testing techniques and evaluate them on only one or two configurations are missing a necessary dimension in their experiments, and developers who ignore this may release buggy software. In a large study on 18 apps across 88 configurations, we show that only one of the 18 apps studied showed no variation at all. The rest showed variation in code coverage, test results, or both. Of the more than 2,000 test cases across all of the apps, 15% vary, and some of the variation is subtle, i.e., not just a test crash. Our results suggest that configurations in Android testing do matter and that developers need to test using configuration-aware techniques.
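The analysis described in item 6 (finding which configurations are best per input, then checking whether a sample of inputs shows the same pattern) can be illustrated with a small sketch. This is not the authors' code or data. The oracle here is a synthetic table, the names `make_toy_oracle` and `best_configs` are hypothetical, and the sketch uses exhaustive lookup rather than the genetic algorithm the abstract mentions.

```python
import random
from collections import Counter


def make_toy_oracle(n_inputs, n_configs, seed=0):
    """Build a synthetic oracle: oracle[i][c] is the score of config c on input i.

    Assumption for illustration only: each configuration has a fixed quality
    plus per-input noise, so a handful of configurations tend to win.
    """
    rng = random.Random(seed)
    quality = [rng.random() for _ in range(n_configs)]
    return [
        [quality[c] + rng.gauss(0, 0.1) for c in range(n_configs)]
        for _ in range(n_inputs)
    ]


def best_configs(oracle, inputs):
    """Count how often each configuration is the best one over the given inputs."""
    wins = Counter()
    for i in inputs:
        scores = oracle[i]
        # Index of the highest-scoring configuration for this input.
        best = max(range(len(scores)), key=scores.__getitem__)
        wins[best] += 1
    return wins


N_INPUTS, N_CONFIGS = 200, 30
oracle = make_toy_oracle(N_INPUTS, N_CONFIGS)

# Pattern over all inputs.
full = best_configs(oracle, range(N_INPUTS))

# Pattern over a 20% random sample of the inputs.
sample = random.Random(1).sample(range(N_INPUTS), 40)
sampled = best_configs(oracle, sample)

print("distinct best configs (all inputs):", len(full))
print("distinct best configs (sample):   ", len(sampled))
print("most frequent (all inputs):", full.most_common(3))
print("most frequent (sample):    ", sampled.most_common(3))
```

Comparing the most frequent winners in `full` and `sampled` gives a rough sense of whether a sample of inputs recovers the same set of winning configurations at lower cost.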