Title: Semisynthetic simulation for microbiome data analysis
Abstract: High-throughput sequencing data lie at the heart of modern microbiome research. Effective analysis of these data requires careful preprocessing, modeling, and interpretation to detect subtle signals and avoid spurious associations. In this review, we discuss how simulation can serve as a sandbox to test candidate approaches, creating a setting that mimics real data while providing ground truth. This is particularly valuable for power analysis, methods benchmarking, and reliability analysis. We explain the probability, multivariate analysis, and regression concepts behind modern simulators and how different implementations make trade-offs between generality, faithfulness, and controllability. Recognizing that all simulators only approximate reality, we review methods to evaluate how accurately they reflect key properties. We also present case studies demonstrating the value of simulation in differential abundance testing, dimensionality reduction, network analysis, and data integration. Code for these examples is available in an online tutorial (https://go.wisc.edu/8994yz) that can be easily adapted to new problem settings.
Award ID(s):
1846216 2113754
PAR ID:
10571032
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Briefings in Bioinformatics
Volume:
26
Issue:
1
ISSN:
1467-5463
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Robot-assisted surgeries (RAS) have an extremely steep learning curve. Because of this, surgeons have created many methods to practice RAS outside the operating room. These training models usually include animal or plastic models; however, extended reality simulators have recently been introduced into surgical training programs. This systematic review and meta-analysis was conducted to determine whether extended reality simulators can improve the performance of robotic novices and how their performance compares to that of surgeons trained conventionally on surgical robots. Following the PRISMA 2020 guidelines, a systematic review was performed, searching PubMed, Embase, Web of Science, and the Cochrane Library for studies that compared the performance of robotic novices who received no additional training, trained with extended reality, or trained with inanimate physical simulators (conventional additional training). Articles that gauged performance using GEARS or time-to-complete measurements were included, while articles that did not make this comparison were excluded. A meta-analysis of the 15 studies found was performed in SPSS to compare the performance outcomes of the novices after training. Robotic novices trained with extended reality simulators showed a statistically significant improvement in time to complete (Cohen’s d = −0.95, p = 0.02) compared to those with no additional training. Extended reality training also showed no statistically significant difference in time to complete (Cohen’s d = 0.65, p = 0.14) or GEARS scores (Cohen’s d = −0.093, p = 0.34) compared to robotic novices trained with conventional models. This meta-analysis seeks to determine if extended reality simulators translate complex skills to surgeons in a low-cost and low-risk environment.
  2. Abstract Background: Statistical geneticists employ simulation to estimate the power of proposed studies, test new analysis tools, and evaluate properties of causal models. Although there are existing trait simulators, there is ample room for modernization. For example, most phenotype simulators are limited to Gaussian traits or traits transformable to normality, while ignoring qualitative traits and realistic, non-normal trait distributions. Also, modern computer languages, such as Julia, that accommodate parallelization and cloud-based computing are now mainstream but rarely used in older applications. To meet the challenges of contemporary big studies, it is important for geneticists to adopt new computational tools. Results: We present an open-source Julia package that makes it trivial to quickly simulate phenotypes under a variety of genetic architectures. This package is integrated into our OpenMendel suite for easy downstream analyses. Julia was purpose-built for scientific programming and provides tremendous speed and memory efficiency, easy access to multi-CPU and GPU hardware, and to distributed and cloud-based parallelization. The package is designed to encourage flexible trait simulation, including via the standard devices of applied statistics, generalized linear models (GLMs) and generalized linear mixed models (GLMMs). It also accommodates many study designs: unrelateds, sibships, pedigrees, or a mixture of all three. (Of course, for data with pedigrees or cryptic relationships, the simulation process must include the genetic dependencies among the individuals.) We consider an assortment of trait models and study designs to illustrate integrated simulation and analysis pipelines. Step-by-step instructions for these analyses are available in our electronic Jupyter notebooks on GitHub. These interactive notebooks are ideal for reproducible research. Conclusion: The package has three main advantages. (1) It leverages the computational efficiency and ease of use of Julia to provide extremely fast, straightforward simulation of even the most complex genetic models, including GLMs and GLMMs. (2) It can be operated entirely within, but is not limited to, the integrated analysis pipeline of OpenMendel. And finally (3), by allowing a wider range of more realistic phenotype models, it brings power calculations and diagnostic tools closer to what investigators might see in real-world analyses.
  3. Self-supervised learning (SSL) is at the core of training modern large machine learning models, providing a scheme for learning powerful representations that can be used in a variety of downstream tasks. However, SSL strategies must be adapted to the type of training data and downstream tasks required. We propose resimulation-based self-supervised representation learning (RS3L), a novel simulation-based SSL strategy that employs a method of resimulation to drive data augmentation for contrastive learning in the physical sciences, particularly in fields that rely on stochastic simulators. By intervening in the middle of the simulation process and rerunning simulation components downstream of the intervention, we generate multiple realizations of an event, thus producing a set of augmentations covering all physics-driven variations available in the simulator. Using experiments from high-energy physics, we explore how this strategy may enable the development of a foundation model; we show how RS3L pretraining enables powerful performance in downstream tasks such as discrimination of a variety of objects and uncertainty mitigation. In addition to our results, we make the RS3L dataset publicly available for further studies on how to improve SSL strategies. Published by the American Physical Society, 2025.
  4. Abstract The unification of general relativity and quantum theory is one of the fascinating problems of modern physics. One leading solution is Loop Quantum Gravity (LQG). Simulating LQG may be important for providing predictions which can then be tested experimentally. However, such complex quantum simulations cannot run efficiently on classical computers, and quantum computers or simulators are needed. Here, we experimentally demonstrate quantum simulations of spinfoam amplitudes of LQG on an integrated photonics quantum processor. We simulate a basic transition of LQG and show that the derived spinfoam vertex amplitude falls within 4% error with respect to the theoretical prediction, despite experimental imperfections. We also discuss how to generalize the simulation for more complex transitions, in realistic experimental conditions, which will eventually lead to a quantum advantage demonstration as well as expand the toolbox to investigate LQG. 
  5. Many applications of cyber-physical systems (CPSs) are life-critical: failure or malfunction can result in significant harm to human life, the environment, or substantial economic loss. Therefore, it is important to ensure their reliability, security, and robustness to attacks. However, there is no widely used toolbox to simulate CPSs and target security problems, especially the simulation of sensor attacks and defense strategies against them. In this work, we introduce our toolbox CPSim, a user-friendly simulation toolbox for security problems in CPSs. CPSim aims to simulate common sensor attacks and countermeasures to these attacks. We have implemented bias attacks, delay attacks, and replay attacks. Additionally, we have implemented various recovery-based methods against sensor attacks. The configurations of the sensor attacks and recovery methods can be customized with the given APIs. CPSim has built-in numerical simulators and various implemented benchmarks. Moreover, CPSim is compatible with other external simulators and can be deployed on a real testbed for control purposes.