skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Construction of coarse-grained molecular dynamics with many-body non-Markovian memory
We introduce a machine-learning-based coarse-grained molecular dynamics (CGMD) model that faithfully retains the many-body nature of the inter-molecular dissipative interactions. Unlike the common empirical CG models, the present model is constructed based on the Mori-Zwanzig formalism and naturally inherits the heterogeneous state-dependent memory term rather than matching the mean-field metrics such as the velocity auto-correlation function. Numerical results show that preserving the many-body nature of the memory term is crucial for predicting the collective transport and diffusion processes, where empirical forms generally show limitations.  more » « less
Award ID(s):
2110981
PAR ID:
10416726
Author(s) / Creator(s):
Date Published:
Journal Name:
arXivorg
ISSN:
2331-8422
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Calculation of many-body correlation functions is one of the critical kernels utilized in many scientific computing areas, especially in Lattice Quantum Chromodynamics (Lattice QCD). It is formalized as a sum of a large number of contraction terms each of which can be represented by a graph consisting of vertices describing quarks inside a hadron node and edges designating quark propagations at specific time intervals. Due to its computation- and memory-intensive nature, real-world physics systems (e.g., multi-meson or multi-baryon systems) explored by Lattice QCD prefer to leverage multi-GPUs. Different from general graph processing, many-body correlation function calculations show two specific features: a large number of computation-/data-intensive kernels and frequently repeated appearances of original and intermediate data. The former results in expensive memory operations such as tensor movements and evictions. The latter offers data reuse opportunities to mitigate the data-intensive nature of many-body correlation function calculations. However, existing graph-based multi-GPU schedulers cannot capture these data-centric features, thus resulting in a sub-optimal performance for many-body correlation function calculations. To address this issue, this paper presents a multi-GPU scheduling framework, MICCO, to accelerate contractions for correlation functions particularly by taking the data dimension (e.g., data reuse and data eviction) into account. This work first performs a comprehensive study on the interplay of data reuse and load balance, and designs two new concepts: local reuse pattern and reuse bound to study the opportunity of achieving the optimal trade-off between them. Based on this study, MICCO proposes a heuristic scheduling algorithm and a machine-learning-based regression model to generate the optimal setting of reuse bounds. Specifically, MICCO is integrated into a real-world Lattice QCD system, Redstar, for the first time running on multiple GPUs. The evaluation demonstrates MICCO outperforms other state-of-art works, achieving up to 2.25× speedup in synthesized datasets, and 1.49× speedup in real-world correlation functions. 
    more » « less
  2. Understanding memory in the context of data visualizations is paramount for effective design. While immediate clarity in a visualization is crucial, retention of its information determines its long‐term impact. While extensive research has underscored the elements enhancing visualization memorability, a limited body of work has delved into modeling the recall process. This study investigates the temporal dynamics of visualization recall, focusing on factors influencing recollection, shifts in recall veracity, and the role of participant demographics. Using data from an empirical study (n = 104), we propose a novel approach combining temporal clustering and handcrafted features to model recall over time. A long short‐term memory (LSTM) model with attention mechanisms predicts recall patterns, revealing alignment with informativeness scores and participant characteristics. Our findings show that perceived informativeness dictates recall focus, with more informative visualizations eliciting narrative‐driven insights and less informative ones prompting aesthetic‐driven responses. Recall accuracy diminishes over time, particularly for unfamiliar visualizations, with age and education significantly shaping recall emphases. These insights advance our understanding of visualization recall, offering practical guidance for designing visualizations that enhance retention and comprehension. All data and materials are available at: https://osf.io/ghe2j/'>https://osf.io/ghe2j/. 
    more » « less
  3. null (Ed.)
    Attention mechanisms have led to many breakthroughs in sequential data modeling but have yet to be incorporated into any generative algorithms for molecular design. Here we explore the impact of adding self-attention layers to generative β -VAE models and show that those with attention are able to learn a complex “molecular grammar” while improving performance on downstream tasks such as accurately sampling from the latent space (“model memory”) or exploring novel chemistries not present in the training data. There is a notable relationship between a model's architecture, the structure of its latent memory and its performance during inference. We demonstrate that there is an unavoidable tradeoff between model exploration and validity that is a function of the complexity of the latent memory. However, novel sampling schemes may be used that optimize this tradeoff. We anticipate that attention will play an important role in future molecular design algorithms that can make efficient use of the detailed molecular substructures learned by the transformer. 
    more » « less
  4. Physical parameterizations (or closures) are used as representations of unresolved subgrid processes within weather and global climate models or coarse-scale turbulent models, whose resolutions are too coarse to resolve small-scale processes. These parameterizations are typically grounded on physically based, yet empirical, representations of the underlying small-scale processes. Machine learning-based parameterizations have recently been proposed as an alternative solution and have shown great promise to reduce uncertainties associated with the parameterization of small-scale processes. Yet, those approaches still show some important mismatches that are often attributed to the stochasticity of the considered process. This stochasticity can be due to coarse temporal resolution, unresolved variables, or simply to the inherent chaotic nature of the process. To address these issues, we propose a new type of parameterization (closure), which is built using memory-based neural networks, to account for the non-instantaneous response of the closure and to enhance its stability and prediction accuracy. We apply the proposed memory-based parameterization, with differentiable solver, to the Lorenz ’96 model in the presence of a coarse temporal resolution and show its capacity to predict skillful forecasts over a long time horizon of the resolved variables compared to instantaneous parameterizations. This approach paves the way for the use of memory-based parameterizations for closure problems. 
    more » « less
  5. Emerging Virtual Reality (VR) displays with embedded eye trackers are currently becoming a commodity hardware (e.g., HTC Vive Pro Eye). Eye-tracking data can be utilized for several purposes, including gaze monitoring, privacy protection, and user authentication/identification. Identifying users is an integral part of many applications due to security and privacy concerns. In this paper, we explore methods and eye-tracking features that can be used to identify users. Prior VR researchers explored machine learning on motion-based data (such as body motion, head tracking, eye tracking, and hand tracking data) to identify users. Such systems usually require an explicit VR task and many features to train the machine learning model for user identification. We propose a system to identify users utilizing minimal eye-gaze-based features without designing any identification-specific tasks. We collected gaze data from an educational VR application and tested our system with two machine learning (ML) models, random forest (RF) and k-nearest-neighbors (kNN), and two deep learning (DL) models: convolutional neural networks (CNN) and long short-term memory (LSTM). Our results show that ML and DL models could identify users with over 98% accuracy with only six simple eye-gaze features. We discuss our results, their implications on security and privacy, and the limitations of our work. 
    more » « less