Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Abstract Recently, a class of algorithms combining classical fixed‐point iterations with repeated random sparsification of approximate solution vectors has been successfully applied to eigenproblems with matrices as large as . So far, a complete mathematical explanation for this success has proven elusive. The family of methods has not yet been extended to the important case of linear system solves. In this paper, we propose a new scheme based on repeated random sparsification that is capable of solving sparse linear systems in arbitrarily high dimensions. We provide a complete mathematical analysis of this new algorithm. Our analysis establishes a faster‐than‐Monte Carlo convergence rate and justifies use of the scheme even when the solution is too large to store as a dense vector.more » « lessFree, publicly-accessible full text available September 8, 2026
-
Abstract Mercury’s orbit can destabilize, generally resulting in a collision with either Venus or the Sun. Chaotic evolution can causeg1to decrease to the approximately constant value ofg5and create a resonance. Previous work has approximated the variation ing1as stochastic diffusion, which leads to a phenomological model that can reproduce the Mercury instability statistics of secular andN-body models on timescales longer than 10 Gyr. Here we show that the diffusive model significantly underpredicts the Mercury instability probability on timescales less than 5 Gyr, the remaining lifespan of the solar system. This is becauseg1exhibits larger variations on short timescales than the diffusive model would suggest. To better model the variations on short timescales, we build a new subdiffusive phenomological model forg1. Subdiffusion is similar to diffusion but exhibits larger displacements on short timescales and smaller displacements on long timescales. We choose model parameters based on the behavior of theg1trajectories in theN-body simulations, leading to a tuned model that can reproduce Mercury instability statistics from 1–40 Gyr. This work motivates fundamental questions in solar system dynamics: why does subdiffusion better approximate the variation ing1than standard diffusion? Why is there an upper bound ong1, but not a lower bound that would prevent it from reachingg5?more » « less
-
Abstract Extreme weather events have significant consequences, dominating the impact of climate on society. While high‐resolution weather models can forecast many types of extreme events on synoptic timescales, long‐term climatological risk assessment is an altogether different problem. A once‐in‐a‐century event takes, on average, 100 years of simulation time to appear just once, far beyond the typical integration length of a weather forecast model. Therefore, this task is left to cheaper, but less accurate, low‐resolution or statistical models. But there is untapped potential in weather model output: despite being short in duration, weather forecast ensembles are produced multiple times a week. Integrations are launched with independent perturbations, causing them to spread apart over time and broadly sample phase space. Collectively, these integrations add up to thousands of years of data. We establish methods to extract climatological information from these short weather simulations. Using ensemble hindcasts by the European Center for Medium‐range Weather Forecasting archived in the subseasonal‐to‐seasonal (S2S) database, we characterize sudden stratospheric warming (SSW) events with multi‐centennial return times. Consistent results are found between alternative methods, including basic counting strategies and Markov state modeling. By carefully combining trajectories together, we obtain estimates of SSW frequencies and their seasonal distributions that are consistent with reanalysis‐derived estimates for moderately rare events, but with much tighter uncertainty bounds, and which can be extended to events of unprecedented severity that have not yet been observed historically. These methods hold potential for assessing extreme events throughout the climate system, beyond this example of stratospheric extremes.more » « less
-
Free, publicly-accessible full text available September 30, 2026
-
Identifying informative low-dimensional features that characterize dynamics in molecular simulations remains a challenge, often requiring extensive manual tuning and system-specific knowledge. Here, we introduce geom2vec, in which pretrained graph neural networks (GNNs) are used as universal geometric featurizers. By pretraining equivariant GNNs on a large dataset of molecular conformations with a self-supervised denoising objective, we obtain transferable structural representations that are useful for learning conformational dynamics without further fine-tuning. We show how the learned GNN representations can capture interpretable relationships between structural units (tokens) by combining them with expressive token mixers. Importantly, decoupling training the GNNs from training for downstream tasks enables analysis of larger molecular graphs (that can represent small proteins at all-atom resolution) with limited computational resources. In these ways, geom2vec eliminates the need for manual feature selection and increases the robustness of simulation analyses.more » « lessFree, publicly-accessible full text available January 28, 2026
-
An issue for molecular dynamics simulations is that events of interest often involve timescales that are much longer than the simulation time step, which is set by the fastest timescales of the model. Because of this timescale separation, direct simulation of many events is prohibitively computationally costly. This issue can be overcome by aggregating information from many relatively short simulations that sample segments of trajectories involving events of interest. This is the strategy of Markov state models (MSMs) and related approaches, but such methods suffer from approximation error because the variables defining the states generally do not capture the dynamics fully. By contrast, once converged, the weighted ensemble (WE) method aggregates information from trajectory segments so as to yield unbiased estimates of both thermodynamic and kinetic statistics. Unfortunately, errors decay no faster than unbiased simulation in WE as originally formulated and commonly deployed. Here, we introduce a theoretical framework for describing WE that shows that the introduction of an approximate stationary distribution on top of the stratification, as in nonequilibrium umbrella sampling (NEUS), accelerates convergence. Building on ideas from MSMs and related methods, we generalize the NEUS approach in such a way that the approximation error can be reduced systematically. We show that the improved algorithm can decrease the simulation time required to achieve the desired precision by orders of magnitude.more » « less
-
Many chemical reactions and molecular processes occur on time scales that are significantly longer than those accessible by direct simulations. One successful approach to estimating dynamical statistics for such processes is to use many short time series of observations of the system to construct a Markov state model, which approximates the dynamics of the system as memoryless transitions between a set of discrete states. The dynamical Galerkin approximation (DGA) is a closely related framework for estimating dynamical statistics, such as committors and mean first passage times, by approximating solutions to their equations with a projection onto a basis. Because the projected dynamics are generally not memoryless, the Markov approximation can result in significant systematic errors. Inspired by quasi-Markov state models, which employ the generalized master equation to encode memory resulting from the projection, we reformulate DGA to account for memory and analyze its performance on two systems: a two-dimensional triple well and the AIB9 peptide. We demonstrate that our method is robust to the choice of basis and can decrease the time series length required to obtain accurate kinetics by an order of magnitude.more » « less
-
Understanding dynamics in complex systems is challenging because there are many degrees of freedom, and those that are most important for describing events of interest are often not obvious. The leading eigenfunctions of the transition operator are useful for visualization, and they can provide an efficient basis for computing statistics, such as the likelihood and average time of events (predictions). Here, we develop inexact iterative linear algebra methods for computing these eigenfunctions (spectral estimation) and making predictions from a dataset of short trajectories sampled at finite intervals. We demonstrate the methods on a low-dimensional model that facilitates visualization and a high-dimensional model of a biomolecular system. Implications for the prediction problem in reinforcement learning are discussed.more » « less
-
Many sampling strategies commonly used in molecular dynamics, such as umbrella sampling and alchemical free energy methods, involve sampling from multiple states. The Multistate Bennett Acceptance Ratio (MBAR) formalism is a widely used way of recombining the resulting data. However, the error of the MBAR estimator is not well-understood: previous error analyses of MBAR assumed independent samples. In this work, we derive a central limit theorem for MBAR estimates in the presence of correlated data, further justifying the use of MBAR in practical applications. Moreover, our central limit theorem yields an estimate of the error that can be decomposed into contributions from the individual Markov chains used to sample the states. This gives additional insight into how sampling in each state affects the overall error. We demonstrate our error estimator on an umbrella sampling calculation of the free energy of isomerization of the alanine dipeptide and an alchemical calculation of the hydration free energy of methane. Our numerical results demonstrate that the time required for the Markov chain to decorrelate in individual states can contribute considerably to the total MBAR error, highlighting the importance of accurately addressing the effect of sample correlation.more » « less
An official website of the United States government
