Title: Learning to simulate high energy particle collisions from unlabeled data
Abstract In many scientific fields that rely on statistical inference, simulations are used to map from theoretical models to experimental data, allowing scientists to test model predictions against experimental results. Experimental data is often reconstructed from indirect measurements, so the aggregate transformation from theoretical models to experimental data is poorly described analytically. Instead, numerical simulations are used at great computational cost. We introduce Optimal-Transport-based Unfolding and Simulation (OTUS), a fast simulator based on unsupervised machine learning that is capable of predicting experimental data from theoretical models. Without the aid of current simulation information, OTUS trains a probabilistic autoencoder to transform directly between theoretical models and experimental data. Identifying the probabilistic autoencoder’s latent space with the space of theoretical models causes the decoder network to become a fast, predictive simulator with the potential to replace current, computationally costly simulators. Here, we provide proof-of-principle results on two particle physics examples, Z-boson and top-quark decays, but stress that OTUS can be widely applied to other fields.
Award ID(s):
2003237 2047418 2007719 1928718
NSF-PAR ID:
10329934
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Scientific Reports
Volume:
12
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. High-resolution simulations can deliver great visual quality, but they are often limited by available memory, especially on GPUs. We present a compiler for physical simulation that can achieve both high performance and significantly reduced memory costs, by enabling flexible and aggressive quantization. Low-precision ("quantized") numerical data types are used and packed to represent simulation states, leading to reduced memory space and bandwidth consumption. Quantized simulation allows higher resolution simulation with less memory, which is especially attractive on GPUs. Implementing a quantized simulator that has high performance and packs the data tightly for aggressive storage reduction would be extremely labor-intensive and error-prone using a traditional programming language. To make the creation of quantized simulation practical, we have developed a new set of language abstractions and a compilation system. A suite of tailored domain-specific optimizations ensures quantized simulators often run as fast as the full-precision simulators, despite the overhead of encoding-decoding the packed quantized data types. Our programming language and compiler, based on Taichi, allow developers to effortlessly switch between different full-precision and quantized simulators, to explore the full design space of quantization schemes, and ultimately to achieve a good balance between space and precision. The creation of quantized simulation with our system has large benefits in terms of memory consumption and performance, on a variety of hardware, from mobile devices to workstations with high-end GPUs. We can simulate with levels of resolution that were previously only achievable on systems with much more memory, such as multiple GPUs.
For example, on a single GPU, we can simulate a Game of Life with 20 billion cells (8× compression per pixel), an Eulerian fluid system with 421 million active voxels (1.6× compression per voxel), and a hybrid Eulerian-Lagrangian elastic object simulation with 235 million particles (1.7× compression per particle). At the same time, quantized simulations create physically plausible results. Our quantization techniques are complementary to existing acceleration approaches of physical simulation: they can be used in combination with these existing approaches, such as sparse data structures, for even higher scalability and performance. 
  2. ABSTRACT

    The analysis of particles bound to surfaces by tethers can facilitate understanding of biophysical phenomena (e.g., DNA–protein or protein–ligand interactions and DNA extensibility). Modeling such systems theoretically aids in understanding experimentally observed motions, and the limitations of such models can provide insight into modeling complex systems. The simulation of tethered particle motion (TPM) allows for analysis of complex behaviors exhibited by such systems; however, this type of experiment is rarely taught in undergraduate science classes. We have developed a MATLAB simulation package intended to be used in academic contexts to concisely model and graphically represent the behavior of different tether–particle systems. We show how analysis of the simulation results can be used in biophysical research using single-molecule force spectroscopy (SMFS). Students in physics, engineering, and chemistry will be able to make connections with principles embedded in the field of study and understand how those principles can be used to create meaningful conclusions in a multidisciplinary context. The simulation package can model any given tether–particle system and allows the user to generate a parameter space with static and dynamic model components. Our simulation was successfully able to recreate generally observed experimental trends by using acoustic force spectroscopy (AFS). Further, the simulation was validated through consideration of the conservation of energy of the tether–bead system, trend analyses, and comparison of particle positional data from actual TPM in silico experiments conducted to simulate data with a parameter space similar to the AFS experimental setup. Overall, our TPM simulator and graphical user interface is primarily for demonstrating behaviors characteristic to TPM in a classroom setting but can serve as a template for researchers to set up TPM simulations to mimic a specific SMFS experimental setup.

     
  3. This work validates lumped-parameter models and cable-based models for nets against data from a parabolic flight experiment. The capabilities of a simulator based in Vortex Studio, a multibody dynamics simulation framework, are expanded by introducing i) a lumped-parameter model of the net with lumped masses placed along the threads and ii) a flexible-cable-based model, both of which enable collision detection with thin bodies. An experimental scenario is recreated in simulation, and the deployment and capture phases are analyzed. Good agreement with experiments is observed in both phases, although with differences primarily due to imperfect knowledge of experimental initial conditions. It is demonstrated that both a lumped-parameter model with inner nodes and a cable-based model can enable the detection of collisions between the net and thin geometries of the target. While both models notably improve capture realism compared to a lumped-parameter model with no inner nodes, the cable-based model is found to be the most computationally efficient. The effect of modeling thread-to-thread collisions (i.e., collisions among parts of the net) is analyzed and determined to be negligible during deployment and initial target wrapping. The results of this work validate the models and increase the confidence in the practicality of this simulator as a tool for research on net-based capture of debris. A cable-based model is validated for the first time in the literature.
  4. Abstract Statistical relational learning (SRL) frameworks are effective at defining probabilistic models over complex relational data. They often use weighted first-order logical rules where the weights of the rules govern probabilistic interactions and are usually learned from data. Existing weight learning approaches typically attempt to learn a set of weights that maximizes some function of data likelihood; however, this does not always translate to optimal performance on a desired domain metric, such as accuracy or F1 score. In this paper, we introduce a taxonomy of search-based weight learning approaches for SRL frameworks that directly optimize weights on a chosen domain performance metric. To effectively apply these search-based approaches, we introduce a novel projection, referred to as scaled space (SS), that is an accurate representation of the true weight space. We show that SS removes redundancies in the weight space and captures the semantic distance between the possible weight configurations. In order to improve the efficiency of search, we also introduce an approximation of SS which simplifies the process of sampling weight configurations. We demonstrate these approaches on two state-of-the-art SRL frameworks: Markov logic networks and probabilistic soft logic. We perform empirical evaluation on five real-world datasets and evaluate them each on two different metrics. We also compare them against four other weight learning approaches. Our experimental results show that our proposed search-based approaches outperform likelihood-based approaches and yield up to a 10% improvement across a variety of performance metrics. Further, we perform an extensive evaluation to measure the robustness of our approach to different initializations and hyperparameters. The results indicate that our approach is both accurate and robust. 
  5. This paper introduces a library for cross-simulator comparison of reinforcement learning models in traffic signal control tasks. This library is developed to implement recent state-of-the-art reinforcement learning models with extensible interfaces and unified cross-simulator evaluation metrics. It supports commonly-used simulators in traffic signal control tasks, including Simulation of Urban MObility (SUMO) and CityFlow, and multiple benchmark datasets for fair comparisons. We conducted experiments to validate our implementation of the models and to calibrate the simulators so that the experiments from one simulator could be referential to the other. Based on the validated models and calibrated environments, this paper compares and reports the performance of current state-of-the-art RL algorithms across different datasets and simulators. This is the first time that these methods have been compared fairly under the same datasets with different simulators.