Title: An empirical investigation of organic software product lines
Abstract: Software product line engineering is a best practice for managing reuse in families of software systems that is increasingly being applied to novel and emerging domains. In this work we investigate the use of software product line engineering in one of these new domains, synthetic biology. In synthetic biology living organisms are programmed to perform new functions or improve existing functions. These programs are designed and constructed using small building blocks made out of DNA. We conjecture that there are families of products that consist of common and variable DNA parts, and we can leverage product line engineering to help synthetic biologists build, evolve, and reuse DNA parts. In this paper we perform an investigation of domain engineering that leverages an open-source repository of more than 45,000 reusable DNA parts. We show the feasibility of these new types of product line models by identifying features and related artifacts in up to 93.5% of products, and that there is indeed both commonality and variability. We then construct feature models for four commonly engineered functions leading to product lines ranging from 10 to 7.5 × 10^20 products. In a case study we demonstrate how we can use the feature models to help guide new experimentation in aspects of application engineering. Finally, in an empirical study we demonstrate the effectiveness and efficiency of automated reverse engineering on both complete and incomplete sets of products. In the process of these studies, we highlight key challenges and uncovered limitations of existing SPL techniques and tools which provide a roadmap for making SPL engineering applicable to new and emerging domains.
Award ID(s):
1901543 1805528
NSF-PAR ID:
10274933
Journal Name:
Empirical Software Engineering
Volume:
26
Issue:
3
ISSN:
1382-3256
Sponsoring Org:
National Science Foundation
More Like this
  1. Software product line engineering is a best practice for managing reuse in families of software systems. In this work, we explore the use of product line engineering in the emerging programming domain of synthetic biology. In synthetic biology, living organisms are programmed to perform new functions or improve existing functions. These programs are designed and constructed using small building blocks made out of DNA. We conjecture that there are families of products that consist of common and variable DNA parts, and we can leverage product line engineering to help synthetic biologists build, evolve, and reuse these programs. As a first step towards this goal, we perform a domain engineering case study that leverages an open-source repository of more than 45,000 reusable DNA parts. We are able to identify features and their related artifacts, all of which can be composed to make different programs. We demonstrate that we can successfully build feature models representing families for two commonly engineered functions. We then analyze an existing synthetic biology case study and demonstrate how product line engineering can be beneficial in this domain. 
  2. Unmanned Aerial Vehicles (UAVs) are increasingly used by emergency responders to support search-and-rescue operations, medical supplies delivery, fire surveillance, and many other scenarios. At the same time, researchers are investigating usage scenarios in which UAVs are imbued with a greater level of autonomy to provide automated search, surveillance, and delivery capabilities that far exceed current adoption practices. To address this emergent opportunity, we are developing a configurable, multi-user, multi-UAV system for supporting the use of semi-autonomous UAVs in diverse emergency response missions. We present a requirements-driven approach for creating a software product line (SPL) of highly configurable scenarios based on different missions. We focus on the process for eliciting and modeling a family of related use cases, constructing individual feature models, and activity diagrams for each scenario, and then merging them into an SPL. We show how the SPL will be implemented through leveraging and augmenting existing features in our DroneResponse system. We further present a configuration tool, and demonstrate its ability to generate mission-specific configurations for 20 different use case scenarios. 
  3. Abstract

    Visualization of gene products in Caenorhabditis elegans has provided insights into the molecular and biological functions of many novel genes in their native contexts. Single‐molecule fluorescence in situ hybridization (smFISH) and immunofluorescence (IF) enable the visualization of the abundance and localization of mRNAs and proteins, respectively, allowing researchers to ultimately elucidate the localization, dynamics, and functions of the corresponding genes. Whereas both smFISH and immunofluorescence have been foundational techniques in molecular biology, each protocol poses challenges for use in the C. elegans embryo. smFISH protocols suffer from high initial costs and can photobleach rapidly, and immunofluorescence requires technically challenging permeabilization steps and slide preparation. Most importantly, published smFISH and IF protocols have predominantly been mutually exclusive, preventing the exploration of relationships between an mRNA and a relevant protein in the same sample. Here, we describe protocols to perform immunofluorescence and smFISH in C. elegans embryos either in sequence or simultaneously. We also outline the steps to perform smFISH or immunofluorescence alone, including several improvements and optimizations to existing approaches. These protocols feature improved fixation and permeabilization steps to preserve cellular morphology while maintaining probe and antibody accessibility in the embryo, a streamlined, in‐tube approach for antibody staining that negates freeze‐cracking, a validated method to perform the cost‐reducing single molecule inexpensive FISH (smiFISH) adaptation, slide preparation using empirically determined optimal antifade products, and straightforward quantification and data analysis methods. Finally, we discuss tricks and tips to help the reader optimize and troubleshoot individual steps in each protocol. Together, these protocols simplify existing workflows for single‐molecule RNA and protein detection. Moreover, simultaneous, high‐resolution imaging of proteins and RNAs of interest will permit analysis, quantification, and comparison of protein and RNA distributions, furthering our understanding of the relationship between RNAs and their protein products or cellular markers in early development. © 2021 Wiley Periodicals LLC.

    Basic Protocol 1: Sequential immunofluorescence and single‐molecule fluorescence in situ hybridization

    Alternate Protocol: Abbreviated protocol for simultaneous immunofluorescence and single‐molecule fluorescence in situ hybridization

    Basic Protocol 2: Simplified immunofluorescence in C. elegans embryos

    Basic Protocol 3: Single‐molecule fluorescence in situ hybridization or single‐molecule inexpensive fluorescence in situ hybridization

     
  4. An important long-term goal in machine learning systems is to build learning agents that, like humans, can learn many tasks over their lifetime, and moreover use information from these tasks to improve their ability to do so efficiently. In this work, our goal is to provide new theoretical insights into the potential of this paradigm. In particular, we propose a lifelong learning framework that adheres to a novel notion of resource efficiency that is critical in many real-world domains where feature evaluations are costly. That is, our learner aims to reuse information from previously learned related tasks to learn future tasks in a feature-efficient manner. Furthermore, we consider novel combinatorial ways in which learning tasks can relate. Specifically, we design lifelong learning algorithms for two structurally different and widely used families of target functions: decision trees/lists and monomials/polynomials. We also provide strong feature-efficiency guarantees for these algorithms; in fact, we show that in order to learn future targets, we need only slightly more feature evaluations per training example than what is needed to predict on an arbitrary example using those targets. We also provide algorithms with guarantees in an agnostic model where not all the targets are related to each other. Finally, we also provide lower bounds on the performance of a lifelong learner in these models, which are in fact tight under some conditions. 
  5. Ruby, Edward G. (Ed.)
    ABSTRACT

    A conspicuous roadblock to studying marine bacteria for fundamental research and biotechnology is a lack of modular synthetic biology tools for their genetic manipulation. Here, we applied, and generated new parts for, a modular plasmid toolkit to study marine bacteria in the context of symbioses and host-microbe interactions. To demonstrate the utility of this plasmid system, we genetically manipulated the marine bacterium Pseudoalteromonas luteoviolacea, which stimulates the metamorphosis of the model tubeworm, Hydroides elegans. Using these tools, we quantified constitutive and native promoter expression, developed reporter strains that enable the imaging of host-bacteria interactions, and used CRISPR interference (CRISPRi) to knock down a secondary metabolite and a host-associated gene. We demonstrate the broader utility of this modular system for testing the genetic tractability of marine bacteria that are known to be associated with diverse host-microbe symbioses. These efforts resulted in the successful conjugation of 12 marine strains from the Alphaproteobacteria and Gammaproteobacteria classes. Altogether, the present study demonstrates how synthetic biology strategies enable the investigation of marine microbes and marine host-microbe symbioses with potential implications for environmental restoration and biotechnology.

    IMPORTANCE

    Marine Proteobacteria are attractive targets for genetic engineering due to their ability to produce a diversity of bioactive metabolites and their involvement in host-microbe symbioses. Modular cloning toolkits have become a standard for engineering model microbes, such as Escherichia coli, because they enable innumerable mix-and-match DNA assembly and engineering options. However, such modular tools have not yet been applied to most marine bacterial species. In this work, we adapt a modular plasmid toolkit for use in a set of 12 marine bacteria from the Gammaproteobacteria and Alphaproteobacteria classes. We demonstrate the utility of this genetic toolkit by engineering a marine Pseudoalteromonas bacterium to study its association with its host animal Hydroides elegans. This work provides a proof of concept that modular genetic tools can be applied to diverse marine bacteria to address basic science questions and for biotechnology innovations.

     