NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

The fast committor machine: Interpretable prediction with kernels

https://doi.org/10.1063/5.0222798

Aristoff, David; Johnson, Mats; Simpson, Gideon; Webber, Robert J (August 2024, The Journal of Chemical Physics)

In the study of stochastic systems, the committor function describes the probability that a system starting from an initial configuration x will reach a set B before a set A. This paper introduces an efficient and interpretable algorithm for approximating the committor, called the “fast committor machine” (FCM). The FCM uses simulated trajectory data to build a kernel-based model of the committor. The kernel function is constructed to emphasize low-dimensional subspaces that optimally describe the A to B transitions. The coefficients in the kernel model are determined using randomized linear algebra, leading to a runtime that scales linearly with the number of data points. In numerical experiments involving a triple-well potential and alanine dipeptide, the FCM yields higher accuracy and trains more quickly than a neural network with the same number of parameters. The FCM is also more interpretable than the neural net.
more » « less
Full Text Available
Featurizing Koopman mode decomposition for robust forecasting

https://doi.org/10.1063/5.0220277

Aristoff, David; Copperman, Jeremy; Mankovich, Nathan; Davies, Alexander (August 2024, The Journal of Chemical Physics)

This article introduces an advanced Koopman mode decomposition (KMD) technique—coined Featurized Koopman Mode Decomposition (FKMD)—that uses delay embedding and a learned Mahalanobis distance to enhance analysis and prediction of high-dimensional dynamical systems. The delay embedding expands the observation space to better capture underlying manifold structures, while the Mahalanobis distance adjusts observations based on the system’s dynamics. This aids in featurizing KMD in cases where good features are not a priori known. We show that FKMD improves predictions for a high-dimensional linear oscillator, a high-dimensional Lorenz attractor that is partially observed, and a cell signaling problem from cancer research.
more » « less
A Benchmark for the Bayesian Inversion of Coefficients in Partial Differential Equations

https://doi.org/10.1137/21M1399464

Aristoff, David; Bangerth, Wolfgang (November 2023, SIAM Review)
na (Ed.)
Bayesian methods have been widely used in the last two decades to infer statistical proper- ties of spatially variable coefficients in partial differential equations from measurements of the solutions of these equations. Yet, in many cases the number of variables used to param- eterize these coefficients is large, and oobtaining meaningful statistics of their probability distributions is difficult using simple sampling methods such as the basic Metropolis– Hastings algorithm—in particular, if the inverse problem is ill-conditioned or ill-posed. As a consequence, many advanced sampling methods have been described in the literature that converge faster than Metropolis–Hastings, for example, by exploiting hierarchies of statistical models or hierarchies of discretizations of the underlying differential equation. At the same time, it remains difficult for the reader of the literature to quantify the advantages of these algorithms because there is no commonly used benchmark. This paper presents a benchmark Bayesian inverse problem—namely, the determination of a spatially variable coefficient, discretized by 64 values, in a Poisson equation, based on point mea- surements of the solution—that fills the gap between widely used simple test cases (such as superpositions of Gaussians) and real applications that are difficult to replicate for de- velopers of sampling algorithms. We provide a complete description of the test case and provide an open-source implementation that can serve as the basis for further experiments. We have also computed 2 × 10^11 samples, at a cost of some 30 CPU years, of the poste- rior probability distribution from which we have generated detailed and accurate statistics against which other sampling algorithms can be tested.
more » « less
Full Text Available
Arbitrarily accurate, nonparametric coarse graining with Markov renewal processes and the Mori–Zwanzig formulation

https://doi.org/10.1063/5.0162440

Aristoff, David; Johnson, Mats; Perez, Danny (September 2023, AIP Advances)

Stochastic dynamics, such as molecular dynamics, are important in many scientific applications. However, summarizing and analyzing the results of such simulations is often challenging due to the high dimension in which simulations are carried out and, consequently, due to the very large amount of data that are typically generated. Coarse graining is a popular technique for addressing this problem by providing compact and expressive representations. Coarse graining, however, potentially comes at the cost of accuracy, as dynamical information is, in general, lost when projecting the problem in a lower-dimensional space. This article shows how to eliminate coarse-graining error using two key ideas. First, we represent coarse-grained dynamics as a Markov renewal process. Second, we outline a data-driven, non-parametric Mori–Zwanzig approach for computing jump times of the renewal process. Numerical tests on a small protein illustrate the method.
more » « less
Weighted ensemble: Recent mathematical developments

https://doi.org/10.1063/5.0110873

Aristoff, D.; Copperman, J.; Simpson, G.; Webber, R. J.; Zuckerman, D. M. (January 2023, The Journal of Chemical Physics)

Weighted ensemble (WE) is an enhanced sampling method based on periodically replicating and pruning trajectories generated in parallel. WE has grown increasingly popular for computational biochemistry problems due, in part, to improved hardware and accessible software implementations. Algorithmic and analytical improvements have played an important role, and progress has accelerated in recent years. Here, we discuss and elaborate on the WE method from a mathematical perspective, highlighting recent results that enhance the computational efficiency. The mathematical theory reveals a new strategy for optimizing trajectory management that approaches the best possible variance while generalizing to systems of arbitrary dimension.
more » « less
An ergodic theorem for the weighted ensemble method

https://doi.org/10.1017/jpr.2021.38

Aristoff, David (March 2022, Journal of Applied Probability)

Abstract We study weighted ensemble, an interacting particle method for sampling distributions of Markov chains that has been used in computational chemistry since the 1990s. Many important applications of weighted ensemble require the computation of long time averages. We establish the consistency of weighted ensemble in this setting by proving an ergodic theorem for time averages. As part of the proof, we derive explicit variance formulas that could be useful for optimizing the method.
more » « less
Full Text Available

Search for: All records