skip to main content

Title: The frontier of simulation-based inference

Many domains of science have developed complex simulations to describe phenomena of interest. While these simulations provide high-fidelity models, they are poorly suited for inference and lead to challenging inverse problems. We review the rapidly developing field of simulation-based inference and identify the forces giving additional momentum to the field. Finally, we describe how the frontier is expanding so that a broad audience can appreciate the profound influence these developments may have on science.

; ;
Award ID(s):
1836650 1841471 1450310 1806738
Publication Date:
Journal Name:
Proceedings of the National Academy of Sciences
Page Range or eLocation-ID:
Article No. 201912789
Proceedings of the National Academy of Sciences
Sponsoring Org:
National Science Foundation
More Like this
  1. Agent-based models provide a flexible framework that is frequently used for modelling many biological systems, including cell migration, molecular dynamics, ecology and epidemiology. Analysis of the model dynamics can be challenging due to their inherent stochasticity and heavy computational requirements. Common approaches to the analysis of agent-based models include extensive Monte Carlo simulation of the model or the derivation of coarse-grained differential equation models to predict the expected or averaged output from the agent-based model. Both of these approaches have limitations, however, as extensive computation of complex agent-based models may be infeasible, and coarse-grained differential equation models can fail tomore »accurately describe model dynamics in certain parameter regimes. We propose that methods from the equation learning field provide a promising, novel and unifying approach for agent-based model analysis. Equation learning is a recent field of research from data science that aims to infer differential equation models directly from data. We use this tutorial to review how methods from equation learning can be used to learn differential equation models from agent-based model simulations. We demonstrate that this framework is easy to use, requires few model simulations, and accurately predicts model dynamics in parameter regions where coarse-grained differential equation models fail to do so. We highlight these advantages through several case studies involving two agent-based models that are broadly applicable to biological phenomena: a birth–death–migration model commonly used to explore cell biology experiments and a susceptible–infected–recovered model of infectious disease spread.« less
  2. Abstract Expert testimony varies in scientific quality and jurors have a difficult time evaluating evidence quality (McAuliff et al., 2009). In the current study, we apply Fuzzy Trace Theory principles, examining whether visual and gist aids help jurors calibrate to the strength of scientific evidence. Additionally we were interested in the role of jurors’ individual differences in scientific reasoning skills in their understanding of case evidence. Contrary to our preregistered hypotheses, there was no effect of evidence condition or gist aid on evidence understanding. However, individual differences between jurors’ numeracy skills predicted evidence understanding. Summary Poor-quality expert evidence is sometimesmore »admitted into court (Smithburn, 2004). Jurors’ calibration to evidence strength varies widely and is not robustly understood. For instance, previous research has established jurors lack understanding of the role of control groups, confounds, and sample sizes in scientific research (McAuliff, Kovera, & Nunez, 2009; Mill, Gray, & Mandel, 1994). Still others have found that jurors can distinguish weak from strong evidence when the evidence is presented alone, yet not when simultaneously presented with case details (Smith, Bull, & Holliday, 2011). This research highlights the need to present evidence to jurors in a way they can understand. Fuzzy Trace Theory purports that people encode information in exact, verbatim representations and through “gist” representations, which represent summary of meaning (Reyna & Brainerd, 1995). It is possible that the presenting complex scientific evidence to people with verbatim content or appealing to the gist, or bottom-line meaning of the information may influence juror understanding of that evidence. Application of Fuzzy Trace Theory in the medical field has shown that gist representations are beneficial for helping laypeople better understand risk and benefits of medical treatment (Brust-Renck, Reyna, Wilhelms, & Lazar, 2016). Yet, little research has applied Fuzzy Trace Theory to information comprehension and application within the context of a jury (c.f. Reyna et. al., 2015). Additionally, it is likely that jurors’ individual characteristics, such as scientific reasoning abilities and cognitive tendencies, influence their ability to understand and apply complex scientific information (Coutinho, 2006). Methods The purpose of this study was to examine how jurors calibrate to the strength of scientific information, and whether individual difference variables and gist aids inspired by Fuzzy Trace Theory help jurors better understand complicated science of differing quality. We used a 2 (quality of scientific evidence: high vs. low) x 2 (decision aid to improve calibration - gist information vs. no gist information), between-subjects design. All hypotheses were preregistered on the Open Science Framework. Jury-eligible community participants (430 jurors across 90 juries; Mage = 37.58, SD = 16.17, 58% female, 56.93% White). Each jury was randomly assigned to one of the four possible conditions. Participants were asked to individually fill out measures related to their scientific reasoning skills prior to watching a mock jury trial. The trial was about an armed bank robbery and consisted of various pieces of testimony and evidence (e.g. an eyewitness testimony, police lineup identification, and a sweatshirt found with the stolen bank money). The key piece of evidence was mitochondrial DNA (mtDNA) evidence collected from hair on a sweatshirt (materials from Hans et al., 2011). Two experts presented opposing opinions about the scientific evidence related to the mtDNA match estimate for the defendant’s identification. The quality and content of this mtDNA evidence differed based on the two conditions. The high quality evidence condition used a larger database than the low quality evidence to compare to the mtDNA sample and could exclude a larger percentage of people. In the decision aid condition, experts in the gist information group presented gist aid inspired visuals and examples to help explain the proportion of people that could not be excluded as a match. Those in the no gist information group were not given any aid to help them understand the mtDNA evidence presented. After viewing the trial, participants filled out a questionnaire on how well they understood the mtDNA evidence and their overall judgments of the case (e.g. verdict, witness credibility, scientific evidence strength). They filled this questionnaire out again after a 45-minute deliberation. Measures We measured Attitudes Toward Science (ATS) with indices of scientific promise and scientific reservations (Hans et al., 2011; originally developed by National Science Board, 2004; 2006). We used Drummond and Fischhoff’s (2015) Scientific Reasoning Scale (SRS) to measure scientific reasoning skills. Weller et al.’s (2012) Numeracy Scale (WNS) measured proficiency in reasoning with quantitative information. The NFC-Short Form (Cacioppo et al., 1984) measured need for cognition. We developed a 20-item multiple-choice comprehension test for the mtDNA scientific information in the cases (modeled on Hans et al., 2011, and McAuliff et al., 2009). Participants were shown 20 statements related to DNA evidence and asked whether these statements were True or False. The test was then scored out of 20 points. Results For this project, we measured calibration to the scientific evidence in a few different ways. We are building a full model with these various operationalizations to be presented at APLS, but focus only on one of the calibration DVs (i.e., objective understanding of the mtDNA evidence) in the current proposal. We conducted a general linear model with total score on the mtDNA understanding measure as the DV and quality of scientific evidence condition, decision aid condition, and the four individual difference measures (i.e., NFC, ATS, WNS, and SRS) as predictors. Contrary to our main hypotheses, neither evidence quality nor decision aid condition affected juror understanding. However, the individual difference variables did: we found significant main effects for Scientific Reasoning Skills, F(1, 427) = 16.03, p <.001, np2 = .04, Weller Numeracy Scale, F(1, 427) = 15.19, p <.001, np2 = .03, and Need for Cognition, F(1, 427) = 16.80, p <.001, np2 = .04, such that those who scored higher on these measures displayed better understanding of the scientific evidence. In addition there was a significant interaction of evidence quality condition and scores on the Weller’s Numeracy Scale, F(1, 427) = 4.10, p = .04, np2 = .01. Further results will be discussed. Discussion These data suggest jurors are not sensitive to differences in the quality of scientific mtDNA evidence, and also that our attempt at helping sensitize them with Fuzzy Trace Theory-inspired aids did not improve calibration. Individual scientific reasoning abilities and general cognition styles were better predictors of understanding this scientific information. These results suggest a need for further exploration of approaches to help jurors differentiate between high and low quality evidence. Note: The 3rd author was supported by an AP-LS AP Award for her role in this research. Learning Objective: Participants will be able to describe how individual differences in scientific reasoning skills help jurors understand complex scientific evidence.« less
  3. Abstract

    In the field of beam physics, two frontier topics have taken center stage due to their potential to enable new approaches to discovery in a wide swath of science. These areas are: advanced, high gradient acceleration techniques, and x-ray free electron lasers (XFELs). Further, there is intense interest in the marriage of these two fields, with the goal of producing a very compact XFEL. In this context, recent advances in high gradient radio-frequency cryogenic copper structure research have opened the door to the use of surface electric fields between 250 and 500 MV m−1. Such an approach is foreseenmore »to enable a new generation of photoinjectors with six-dimensional beam brightness beyond the current state-of-the-art by well over an order of magnitude. This advance is an essential ingredient enabling an ultra-compact XFEL (UC-XFEL). In addition, one may accelerate these bright beams to GeV scale in less than 10 m. Such an injector, when combined with inverse free electron laser-based bunching techniques can produce multi-kA beams with unprecedented beam quality, quantified by 50 nm-rad normalized emittances. The emittance, we note, is the effective area in transverse phase space (x,px/mec) or (y,py/mec) occupied by the beam distribution, and it is relevant to achievable beam sizes as well as setting a limit on FEL wavelength. These beams, when injected into innovative, short-period (1–10 mm) undulators uniquely enable UC-XFELs having footprints consistent with university-scale laboratories. We describe the architecture and predicted performance of this novel light source, which promises photon production per pulse of a few percent of existing XFEL sources. We review implementation issues including collective beam effects, compact x-ray optics systems, and other relevant technical challenges. To illustrate the potential of such a light source to fundamentally change the current paradigm of XFELs with their limited access, we examine possible applications in biology, chemistry, materials, atomic physics, industry, and medicine—including the imaging of virus particles—which may profit from this new model of performing XFEL science.

    « less
  4. Windecker, Saras (Ed.)
    1. The ecological and environmental science communities have embraced machine learning (ML) for empirical modelling and prediction. However, going beyond prediction to draw insights into underlying functional relationships between response variables and environmental ‘drivers’ is less straightforward. Deriving ecological insights from fitted ML models requires techniques to extract the ‘learning’ hidden in the ML models. 2. We revisit the theoretical background and effectiveness of four approaches for deriving insights from ML: ranking independent variable importance (Gini importance, GI; permutation importance, PI; split importance, SI; and conditional permutation importance, CPI), and two approaches for inference of bivariate functional relationships (partial dependencemore »plots, PDP; and accumulated local effect plots, ALE). We also explore the use of a surrogate model for visualization and interpretation of complex multi-variate relationships between response variables and environmental drivers. We examine the challenges and opportunities for extracting ecological insights with these interpretation approaches. Specifically, we aim to improve interpretation of ML models by investigating how effectiveness relates to (a) interpretation algorithm, (b) sample size and (c) the presence of spurious explanatory variables. 3. We base the analysis on simulations with known underlying functional relationships between response and predictor variables, with added white noise and the presence of correlated but non-influential variables. The results indicate that deriving ecological insight is strongly affected by interpretation algorithm and spurious variables, and moderately impacted by sample size. Removing spurious variables improves interpretation of ML models. Meanwhile, increasing sample size has limited value in the presence of spurious variables, but increasing sample size does improves performance once spurious variables are omitted. Among the four ranking methods, SI is slightly more effective than the other methods in the presence of spurious variables, while GI and SI yield higher accuracy when spurious variables are removed. PDP is more effective in retrieving underlying functional relationships than ALE, but its reliability declines sharply in the presence of spurious variables. Visualization and interpretation of the interactive effects of predictors and the response variable can be enhanced using surrogate models, including three-dimensional visualizations and use of loess planes to represent independent variable effects and interactions. 4. Machine learning analysts should be aware that including correlated independent variables in ML models with no clear causal relationship to response variables can interfere with ecological inference. When ecological inference is important, ML models should be constructed with independent variables that have clear causal effects on response variables. While interpreting ML models for ecological inference remains challenging, we show that careful choice of interpretation methods, exclusion of spurious variables and adequate sample size can provide more and better opportunities to ‘learn from machine learning’.« less
  5. We present a new prospective analysis of deep multi-band imaging with the James Webb Space Telescope (JWST). In this work, we investigate the recovery of high-redshift 5 <   z  <  12 galaxies through extensive image simulations of accepted JWST programs, including the Early Release Science in the EGS field and the Guaranteed Time Observations in the HUDF. We introduced complete samples of ∼300 000 galaxies with stellar masses of log( M * / M ⊙ ) > 6 and redshifts of 0 <   z  <  15, as well as galactic stars, into realistic mock NIRCam, MIRI, and HST images tomore »properly describe the impact of source blending. We extracted the photometry of the detected sources, as in real images, and estimated the physical properties of galaxies through spectral energy distribution fitting. We find that the photometric redshifts are primarily limited by the availability of blue-band and near-infrared medium-band imaging. The stellar masses and star formation rates are recovered within 0.25 and 0.3 dex, respectively, for galaxies with accurate photometric redshifts. Brown dwarfs contaminating the z  >  5 galaxy samples can be reduced to < 0.01 arcmin −2 with a limited impact on galaxy completeness. We investigate multiple high-redshift galaxy selection techniques and find that the best compromise between completeness and purity at 5 <   z  <  10 using the full redshift posterior probability distributions. In the EGS field, the galaxy completeness remains higher than 50% at magnitudes m UV  <  27.5 and at all redshifts, and the purity is maintained above 80 and 60% at z  ≤ 7 and 10, respectively. The faint-end slope of the galaxy UV luminosity function is recovered with a precision of 0.1–0.25, and the cosmic star formation rate density within 0.1 dex. We argue in favor of additional observing programs covering larger areas to better constrain the bright end.« less