skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 10:00 PM ET on Friday, February 6 until 10:00 AM ET on Saturday, February 7 due to maintenance. We apologize for the inconvenience.


Title: Verify Engineering Models, Not Scientific Models
Driving progress in science and engineering for centuries, models are powerful tools for understanding systems and building abstractions. However, the goal of models in science is different from that in engineering, and we observe the misuse of models undermining research goals. Specifically in the field of formal methods, we advocate that verification should be performed on engineering models rather than scientific models, to the extent possible. We observe that models under verification are, very often, scientific models rather than engineering models, and we show why verifying scientific models is ineffective in engineering efforts. To guarantee safety in an engineered system, it is the engineering model one should verify. This model can be used to derive a correct-by-construction implementation. To demonstrate our proposed principle, we review lessons learned from verifying programs in a language called Lingua Franca using Timed Rebeca.  more » « less
Award ID(s):
2233769
PAR ID:
10663125
Author(s) / Creator(s):
;
Editor(s):
Lee, EA; Mousavi, MR; Talcott, C
Publisher / Repository:
Springer Nature Switzerland
Date Published:
Page Range / eLocation ID:
195 to 211
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Verification activities increase an engineering team’s confidence in its system design meeting system requirements, which in turn are derived from stakeholder needs. Conventional wisdom suggests that the system design should be verified frequently to minimize the cost of rework as the system design matures. However, this strategy is based more on experience of engineers than on a theoretical foundation. In this paper, we develop a belief-based model of verification of system design, using a single system requirement as an abstraction, to determine the conditions under which it is cost effective for an organization to verify frequently. We study the model for a broad set of growth rates in verification setup and rework costs. Our results show that verifying a system design frequently is not always an optimal verification strategy. Instead, it is only an optimal strategy when the costs of reworking a faulty design increase at a certain rate as the design matures. 
    more » « less
  2. Abstract PremiseThe selection ofArabidopsisas a model organism played a pivotal role in advancing genomic science. The competing frameworks to select an agricultural‐ or ecological‐based model species were rejected, in favor of building knowledge in a species that would facilitate genome‐enabled research. MethodsHere, we examine the ability of models based onArabidopsisgene expression data to predict tissue identity in other flowering plants. Comparing different machine learning algorithms, models trained and tested onArabidopsisdata achieved near perfect precision and recall values, whereas when tissue identity is predicted across the flowering plants using models trained onArabidopsisdata, precision values range from 0.69 to 0.74 and recall from 0.54 to 0.64. ResultsThe identity of belowground tissue can be predicted more accurately than other tissue types, and the ability to predict tissue identity is not correlated with phylogenetic distance fromArabidopsis.k‐nearest neighbors is the most successful algorithm, suggesting that gene expression signatures, rather than marker genes, are more valuable to create models for tissue and cell type prediction in plants. DiscussionOur data‐driven results highlight that the assertion that knowledge fromArabidopsisis translatable to other plants is not always true. Considering the current landscape of abundant sequencing data, we should reevaluate the scientific emphasis onArabidopsisand prioritize plant diversity. 
    more » « less
  3. Dynamical systems that evolve continuously over time are ubiquitous throughout science and engineering. Machine learning (ML) provides data-driven approaches to model and predict the dynamics of such systems. A core issue with this approach is that ML models are typically trained on discrete data, using ML methodologies that are not aware of underlying continuity properties. This results in models that often do not capture any underlying continuous dynamics—either of the system of interest, or indeed of any related system. To address this challenge, we develop a convergence test based on numerical analysis theory. Our test verifies whether a model has learned a function that accurately approximates an underlying continuous dynamics. Models that fail this test fail to capture relevant dynamics, rendering them of limited utility for many scientific prediction tasks; while models that pass this test enable both better interpolation and better extrapolation in multiple ways. Our results illustrate how principled numerical analysis methods can be coupled with existing ML training/testing methodologies to validate models for science and engineering applications. 
    more » « less
  4. Abstract Obtaining accurate estimates of machine learning model uncertainties on newly predicted data is essential for understanding the accuracy of the model and whether its predictions can be trusted. A common approach to such uncertainty quantification is to estimate the variance from an ensemble of models, which are often generated by the generally applicable bootstrap method. In this work, we demonstrate that the direct bootstrap ensemble standard deviation is not an accurate estimate of uncertainty but that it can be simply calibrated to dramatically improve its accuracy. We demonstrate the effectiveness of this calibration method for both synthetic data and numerous physical datasets from the field of Materials Science and Engineering. The approach is motivated by applications in physical and biological science but is quite general and should be applicable for uncertainty quantification in a wide range of machine learning regression models. 
    more » « less
  5. Miesenberger, K; Robles, A; Ruiz, S (Ed.)
    While 3D printing may be a promising tool for making Science, Technology, Engineering, and Mathematics (STEM) education more accessible for students with visual impairments, most research centers on creating and using tactile models and braille, rather than direct student use of 3D printing technologies. This study observed 121 high school studentswith visual impairments across twelve states, examining whether and how students with visual impairments engage in scientific and engineering practices during their assembling of a 3D printer. We found that students exhibited all eight of the science and engineering behaviors defined in the National Research Council's A Framework for K-12 Science Education:Practices, Crosscutting Concepts, and Core Ideas. This study builds upon the work of Hilson and Wild and shows that students with visual impairments, when given the opportunity, can demonstrate scientific and engineering process skills just as their sighted peers do. This is the largest sample of students with visual impairments to date to be observed to document their work and behaviors in this area of STEM research. However, further research is needed to examine science and engineering behaviors of students with visual impairments in other STEM areas and while completing other complex STEM tasks. 
    more » « less