Title: Surveying the energy landscape of coarse-grained mappings
Simulations of soft materials often adopt low-resolution coarse-grained (CG) models. However, the CG representation is not unique and its impact upon simulated properties is poorly understood. In this work, we investigate the space of CG representations for ubiquitin, which is a typical globular protein with 72 amino acids. We employ Monte Carlo methods to ergodically sample this space and to characterize its landscape. By adopting the Gaussian network model as an analytically tractable atomistic model for equilibrium fluctuations, we exactly assess the intrinsic quality of each CG representation without introducing any approximations in sampling configurations or in modeling interactions. We focus on two metrics, the spectral quality and the information content, that quantify the extent to which the CG representation preserves low-frequency, large-amplitude motions and configurational information, respectively. The spectral quality and information content are weakly correlated among high-resolution representations but become strongly anticorrelated among low-resolution representations. Representations with maximal spectral quality appear consistent with physical intuition, while low-resolution representations with maximal information content do not. Interestingly, quenching studies indicate that the energy landscape of mapping space is very smooth and highly connected. Moreover, our study suggests a critical resolution below which a “phase transition” qualitatively distinguishes good and bad representations.
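The Gaussian network model mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It uses the standard GNM construction (a Kirchhoff connectivity matrix whose pseudo-inverse gives the fluctuation covariance, up to a constant factor) and assumes an equal-weight averaging map from atoms to CG sites. The function names, the 7.0 cutoff, and the toy chain are hypothetical choices made for illustration. The paper's spectral quality and information content metrics are not reproduced here.

```python
import numpy as np

def kirchhoff(coords, cutoff=7.0):
    """Standard GNM connectivity matrix: -1 for each pair closer than
    `cutoff` (an assumed value), contact count on the diagonal."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    contact = (d < cutoff) & ~np.eye(len(coords), dtype=bool)
    gamma = -contact.astype(float)
    np.fill_diagonal(gamma, contact.sum(axis=1))
    return gamma

def mapping_matrix(assignment, n_sites):
    """Row I averages the atoms assigned to CG site I with equal weights
    (an assumption). Every site must own at least one atom."""
    m = np.zeros((n_sites, len(assignment)))
    m[assignment, np.arange(len(assignment))] = 1.0
    return m / m.sum(axis=1, keepdims=True)

def cg_covariance(gamma, m):
    """Covariance of the mapped coordinates. A linear map of a Gaussian
    is Gaussian, so the CG covariance is M C M^T, with C taken as the
    pseudo-inverse of the Kirchhoff matrix (constant prefactor dropped)."""
    cov = np.linalg.pinv(gamma)
    return m @ cov @ m.T

# Toy input: six sites on a line, 3.8 apart, so only nearest neighbors
# fall within the cutoff. The first three map to site 0, the rest to site 1.
coords = np.array([[3.8 * i, 0.0, 0.0] for i in range(6)])
assignment = np.array([0, 0, 0, 1, 1, 1])
gamma = kirchhoff(coords)
m = mapping_matrix(assignment, n_sites=2)
c_cg = cg_covariance(gamma, m)
```

Because the mapped fluctuations are available in closed form, any candidate assignment of atoms to sites can be evaluated without running a simulation, which is what makes sampling over many mappings practical in a sketch like this one.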
Award ID(s):
2154433 1856337
PAR ID:
10499897
Author(s) / Creator(s):
; ;
Publisher / Repository:
American Institute of Physics
Date Published:
Journal Name:
The Journal of Chemical Physics
Volume:
160
Issue:
5
ISSN:
0021-9606
Page Range / eLocation ID:
054105
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Low-resolution coarse-grained (CG) models provide significant computational and conceptual advantages for simulating soft materials. However, the properties of CG models depend quite sensitively upon the mapping, M, that maps each atomic configuration, r, to a CG configuration, R. In particular, M determines how the configurational information of the atomic model is partitioned between the mapped ensemble of CG configurations and the lost ensemble of atomic configurations that map to each R. In this work, we investigate how the mapping partitions the atomic configuration space into CG and intra-site components. We demonstrate that the corresponding coordinate transformation introduces a nontrivial Jacobian factor. This Jacobian factor defines a labeling entropy that corresponds to the uncertainty in the atoms that are associated with each CG site. Consequently, the labeling entropy effectively transfers configurational information from the lost ensemble into the mapped ensemble. Moreover, our analysis highlights the possibility of resonant mappings that separate the atomic potential into CG and intra-site contributions. We numerically illustrate these considerations with a Gaussian network model for the equilibrium fluctuations of actin. We demonstrate that the spectral quality, Q, provides a simple metric for identifying high quality representations for actin. Conversely, we find that neither maximizing nor minimizing the information content of the mapped ensemble results in high quality representations. However, if one accounts for the labeling uncertainty, Q(M) correlates quite well with the adjusted configurational information loss, Îmap(M), that results from the mapping. 
  2. Significance: Physical phenomena can often be described by surprisingly few order parameters. Unfortunately, it is challenging to identify these essential degrees of freedom. Here we develop a statistical physics framework for exploring the landscape of order parameters, or coarse-grained representations, for a microscopic protein model. We employ Monte Carlo methods to statistically characterize this landscape. We define metrics assessing the intrinsic quality of each representation for preserving the configurational information and large-scale motions of the underlying microscopic model. Interestingly, these metrics are anticorrelated in low-resolution representations. Moreover, below a critical resolution, a phase transition qualitatively distinguishes superior and inferior representations. Finally, we relate our work to recent approaches for clustering graphs and detecting communities in networks.
  3. A geologic map is both a visual depiction of the lithologies and structures occurring at the Earth’s surface and a representation of a conceptual model for the geologic history in a region. The work needed to capture such multifaceted information in an accurate geologic map is time consuming. Remote sensing can complement traditional primary field observations, geochemistry, chronometry, and subsurface geophysical data in providing useful information to assist with the geologic mapping process. Two novel sources of remote sensing data are particularly relevant for geologic mapping applications: decameter-resolution imaging spectroscopy (spectroscopic imaging) and meter-resolution multispectral shortwave infrared (SWIR) imaging. Decameter spectroscopic imagery can capture important mineral absorptions but is frequently unable to spatially resolve important geologic features. Meter-resolution multispectral SWIR images are better able to resolve fine spatial features but offer reduced spectral information. Such disparate but complementary datasets can be challenging to integrate into the geologic mapping process. Here, we conduct a comparative analysis of spatial and spectral scaling for two such datasets: one Airborne Visible/Infrared Imaging Spectrometer—Classic (AVIRIS-classic) flightline, and one WorldView-3 (WV3) scene, for a geologically complex landscape in Anza-Borrego Desert State Park, California. To do so, we use a two-stage framework that synthesizes recent advances in the spectral mixture residual and joint characterization. The mixture residual uses the wavelength-explicit misfit of a linear spectral mixture model to capture low variance spectral signals. Joint characterization utilizes nonlinear dimensionality reduction (manifold learning) to visualize spectral feature space topology and identify clusters of statistically similar spectra.
For this study area, the spectral mixture residual clearly reveals greater spectral dimensionality in AVIRIS than WorldView (99% of variance in 39 versus 5 residual dimensions). Additionally, joint characterization shows more complex spectral feature space topology for AVIRIS than WorldView, revealing information useful to the geologic mapping process in the form of mineralogical variability both within and among mapped geologic units. These results illustrate the potential of recent and planned imaging spectroscopy missions to complement high-resolution multispectral imagery—along with field and lab observations—in planning, collecting, and interpreting the results from geologic field work. 
  4. Bottom-up methods for coarse-grained (CG) molecular modeling are critically needed to establish rigorous links between atomistic reference data and reduced molecular representations. For a target molecule, the ideal reduced CG representation is a function of both the conformational ensemble of the system and the target physical observable(s) to be reproduced at the CG resolution. However, there is an absence of algorithms for selecting CG representations of molecules from which complex properties, including molecular electronic structure, can be accurately modeled. We introduce continuously gated message passing (CGMP), a graph neural network (GNN) method for atomically decomposing molecular electronic structure sampled over conformational ensembles. CGMP integrates 3D-invariant GNNs and a novel gated message passing system to continuously reduce the atomic degrees of freedom accessible for electronic predictions, resulting in a one-shot importance ranking of atoms contributing to a target molecular property. Moreover, CGMP provides the first approach by which to quantify the degeneracy of “good” CG representations conditioned on specific prediction targets, facilitating the development of more transferable CG representations. We further show how CGMP can be used to highlight multiatom correlations, illuminating a path to developing CG electronic Hamiltonians in terms of interpretable collective variables for arbitrarily complex molecules. 
  5. The maximal coding rate reduction (MCR2) objective for learning structured and compact deep representations is drawing increasing attention, especially after its recent usage in the derivation of fully explainable and highly effective deep network architectures. However, it lacks a complete theoretical justification: only the properties of its global optima are known, and its global landscape has not been studied. In this work, we give a complete characterization of the properties of all its local and global optima, as well as other types of critical points. Specifically, we show that each (local or global) maximizer of the MCR2 problem corresponds to a low-dimensional, discriminative, and diverse representation, and furthermore, each critical point of the objective is either a local maximizer or a strict saddle point. Such a favorable landscape makes MCR2 a natural choice of objective for learning diverse and discriminative representations via first-order optimization methods. To validate our theoretical findings, we conduct extensive experiments on both synthetic and real data sets. 