Title: Weighted ensemble: Recent mathematical developments
Weighted ensemble (WE) is an enhanced sampling method based on periodically replicating and pruning trajectories that are generated in parallel. WE has grown increasingly popular for computational biochemistry problems due, in part, to improved hardware and accessible software implementations. Algorithmic and analytical improvements have also played an important role, and progress has accelerated in recent years. Here, we discuss and elaborate on the WE method from a mathematical perspective, highlighting recent results that enhance its computational efficiency. The mathematical theory reveals a new strategy for optimizing trajectory management that approaches the best possible variance while generalizing to systems of arbitrary dimension.
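The replicate-and-prune step the abstract refers to is WE's resampling operation: within each bin of a partitioned state space, heavy trajectories are split and light ones merged so that total statistical weight is conserved. Below is a minimal sketch of one such step for a single bin, assuming walkers are stored as (position, weight) pairs; the fixed per-bin target count and all names are illustrative, not taken from the paper.

```python
import numpy as np

def resample_bin(walkers, target, rng):
    """One WE split/merge pass inside a single bin; total weight is conserved.

    walkers: list of (position, weight) pairs with positive weights.
    """
    walkers = sorted(walkers, key=lambda w: w[1])          # ascending weight
    # Merge: repeatedly combine the two lightest walkers, keeping one of
    # the two positions with probability proportional to its weight
    # (this weighted coin flip is what keeps the scheme unbiased).
    while len(walkers) > target:
        (x1, w1), (x2, w2) = walkers[0], walkers[1]
        keep = x1 if rng.random() < w1 / (w1 + w2) else x2
        walkers = sorted([(keep, w1 + w2)] + walkers[2:], key=lambda w: w[1])
    # Split: clone the heaviest walker with half its weight until the bin
    # holds `target` walkers.
    while 0 < len(walkers) < target:
        x, w = walkers.pop()                               # heaviest walker
        walkers += [(x, w / 2.0), (x, w / 2.0)]
        walkers.sort(key=lambda w: w[1])
    return walkers

rng = np.random.default_rng(1)
bin_walkers = [(0.1, 0.5), (0.2, 0.3), (0.9, 0.2)]
print(resample_bin(bin_walkers, target=4, rng=rng))        # weights still sum to 1
```

The variance-optimization results highlighted in the abstract concern how the bins and per-bin targets in a scheme like this should be chosen.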
Award ID(s): 2111278, 2111277
PAR ID: 10440598
Publisher / Repository: American Institute of Physics
Journal Name: The Journal of Chemical Physics
Volume: 158
Issue: 1
ISSN: 0021-9606
Sponsoring Org: National Science Foundation
More Like this
  1. Image registration has been widely studied over the past several decades, with numerous applications in science, engineering, and medicine. Most conventional mathematical models for large-deformation image registration rely on prescribed landmarks, which usually require tedious manual labeling. In recent years, there has been a surge of interest in the use of machine learning for image registration. In this paper, we develop a novel method for large-deformation image registration that fuses quasiconformal theory with a convolutional neural network (CNN). More specifically, we propose a quasiconformal energy model with a novel fidelity term that incorporates features extracted using a pre-trained CNN, thereby allowing us to obtain meaningful registration results without any guidance from prescribed landmarks. Moreover, unlike many prior image registration methods, the bijectivity of our method is guaranteed by quasiconformal theory. Experimental results are presented to demonstrate the effectiveness of the proposed method. More broadly, our work sheds light on how rigorous mathematical theories and practical machine learning approaches can be integrated to develop computational methods with improved performance.
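The bijectivity guarantee mentioned above comes from quasiconformal theory: a map whose Beltrami coefficient mu = f_zbar / f_z satisfies |mu| < 1 everywhere is locally orientation-preserving and injective. A minimal numpy sketch of checking this for a map sampled on a grid (illustrative only, not the paper's implementation):

```python
import numpy as np

def beltrami_coefficient(f):
    """Beltrami coefficient of a complex-valued map f sampled on a grid.

    f: 2-D complex array, f[y, x] = u(x, y) + 1j * v(x, y).
    Returns mu = f_zbar / f_z; |mu| < 1 everywhere implies the map is
    locally orientation-preserving and injective (quasiconformal).
    """
    fy, fx = np.gradient(f)            # finite-difference partials in y and x
    f_z = 0.5 * (fx - 1j * fy)         # Wirtinger derivative d/dz
    f_zbar = 0.5 * (fx + 1j * fy)      # Wirtinger derivative d/dzbar
    return f_zbar / f_z

# The identity map has mu == 0; a mild shear has 0 < |mu| < 1.
n = 64
y, x = np.mgrid[0:n, 0:n].astype(float)
mu = beltrami_coefficient((x + 0.3 * y) + 1j * y)   # shear in x
print(np.abs(mu).max())                              # about 0.15, so bijective
```

Energy models of the kind described penalize |mu| so that it stays below 1, which is what rules out folding in the registration map.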
  2. Metric magnitude of a point cloud is a measure of its "size". It has been adapted to various mathematical contexts, and recent work suggests that it can enhance machine learning and optimization algorithms. But its usability is limited by its computational cost when the dataset is large or when the computation must be carried out repeatedly (e.g., during model training). In this paper, we study the magnitude computation problem and show efficient ways of approximating it. We show that it can be cast as a convex optimization problem, but not as a submodular optimization. The paper describes two new algorithms: an iterative approximation algorithm that converges fast and is accurate in practice, and a subset selection method that makes the computation even faster. It has previously been proposed that the magnitude of the model sequences generated during stochastic gradient descent is correlated with the generalization gap. Extending this result using our more scalable algorithms shows that longer sequences bear higher correlations. We also describe new applications of magnitude in machine learning: as an effective regularizer for neural network training, and as a novel clustering criterion.
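For a finite point cloud, the standard (Leinster) definition of magnitude sums the entries of the inverse of the similarity matrix Z, where Z_ij = exp(-d(x_i, x_j)). The exact computation below costs O(n^3) in the number of points, which is precisely the bottleneck that approximation algorithms like those described above target; the scale parameter is an illustrative convenience.

```python
import numpy as np
from scipy.spatial.distance import cdist

def magnitude(points, scale=1.0):
    """Exact metric magnitude of a point cloud via an O(n^3) linear solve.

    points: (n, d) array; `scale` rescales the metric, d -> scale * d.
    """
    Z = np.exp(-scale * cdist(points, points))    # similarity matrix
    w = np.linalg.solve(Z, np.ones(len(points)))  # weight vector: Z w = 1
    return w.sum()                                # magnitude = 1^T Z^{-1} 1

rng = np.random.default_rng(0)
print(magnitude(rng.normal(size=(200, 3))))       # grows with "effective size"
```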
  3. Contemporary science is a field that is becoming increasingly computational. Today's scientists not only leverage computational tools to conduct their investigations; they often must also contribute to the design of the computational tools for their specific research. From a science education perspective, for students to learn authentic science practices, students must learn to use the tools of the trade. This necessity in science education has shaped recent K–12 science standards, including the Next Generation Science Standards, which explicitly mention the use of computational tools and simulations. These standards, in particular, have gone further and mandated that computational thinking be taught and leveraged as a practice of science. While computational thinking is not a new term, its inclusion in K–12 science standards has led to confusion about what the term means in the context of science learning and to questions about how to differentiate computational thinking from other commonly taught cognitive skills in science like problem-solving, mathematical reasoning, and critical thinking. In this paper, we propose a definition of computational thinking for science (CT-S) and a framework for its operationalization in K–12 science education. We situate our definition and framework in Activity Theory, from the learning sciences, in order to position computational thinking as an input to and outcome of science learning that is mediated by computational tools.
  4. Subdata selection from big data is an active area of research that facilitates inferences based on big data with limited computational expense. For linear regression models, the optimal design-inspired Information-Based Optimal Subdata Selection (IBOSS) method is a computationally efficient method for selecting subdata with excellent statistical properties. But the method can only be used if the subdata size, k, is at least twice the number of regression variables, p. In addition, even when $$k\ge 2p$$, under the assumption of effect sparsity one can expect to obtain subdata with better statistical properties by focusing on the active variables. Inspired by recent efforts to extend the IBOSS method to situations with a large number of variables p, we introduce a method called Combining Lasso And Subdata Selection (CLASS) that, as shown, improves on other proposed methods in terms of variable selection and building a predictive model based on subdata when the full data size n is very large and the number of variables p is large. In terms of computational expense, CLASS is more expensive than recent competitors for moderately large values of n, but the roles reverse under effect sparsity for extremely large values of n.
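A rough sketch of the two-stage idea, under the assumption that it amounts to Lasso screening on a pilot sample followed by IBOSS-style extreme-value selection restricted to the estimated active variables; this is a paraphrase of the general approach, not the authors' exact CLASS algorithm, and all names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def class_like_subdata(X, y, k, pilot=1000, rng=None):
    """Two-stage subdata selection: Lasso screening, then IBOSS-style picks.

    Returns row indices of the selected subdata (at most k of them).
    """
    if rng is None:
        rng = np.random.default_rng()
    # Stage 1: fit Lasso on a random pilot sample to estimate the active set.
    pilot_idx = rng.choice(len(X), size=min(pilot, len(X)), replace=False)
    active = np.flatnonzero(LassoCV(cv=5).fit(X[pilot_idx], y[pilot_idx]).coef_)
    if active.size == 0:
        return pilot_idx[:k]
    # Stage 2: IBOSS-style selection on active variables only: for each,
    # keep the rows with the r smallest and r largest values.
    r = max(1, k // (2 * active.size))
    chosen = set()
    for j in active:
        order = np.argsort(X[:, j])
        chosen.update(order[:r])
        chosen.update(order[-r:])
    return np.array(sorted(chosen))[:k]

rng = np.random.default_rng(0)
X = rng.normal(size=(50_000, 100))
y = X[:, 0] - 2 * X[:, 3] + rng.normal(size=50_000)   # two active variables
print(len(class_like_subdata(X, y, k=200, rng=rng)))
```

Restricting stage 2 to the estimated active set is what lets k stay small relative to the full variable count p, which is exactly the regime where plain IBOSS's k >= 2p requirement breaks down.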
  5. Tight junctions form a barrier that controls the passive transport of ions and small molecules across epithelia and endothelia. In addition to forming a barrier, some claudins control the transport properties of tight junctions by forming charge- and size-selective ion channels. It has been suggested that claudin monomers can form, or incorporate into, tight junction strands to create channels. The resolution of crystallographic structures of several claudins in recent years has provided an opportunity to examine the structural basis of claudin function in tight junctions. Computational and theoretical modeling relying on an atomic description of the pore has contributed significantly to our understanding of claudin pores and paracellular transport. In this paper, we review recent computational and mathematical modeling of claudin barrier function. We focus on dynamic modeling of global epithelial barrier function as a function of claudin pores, and on molecular dynamics studies of claudins leading to a functional model of claudin channels.