Title: BASC: Applying Bayesian optimization to the search for global minima on potential energy surfaces
We present a novel application of Bayesian optimization to the field of surface science: rapidly and accurately searching for the global minimum on potential energy surfaces. Controlling molecule-surface interactions is key for applications ranging from environmental catalysis to gas sensing. We present pragmatic techniques, including exploration/exploitation scheduling and a custom covariance kernel that encodes the properties of our objective function. Our method, the Bayesian Active Site Calculator (BASC), outperforms differential evolution and constrained minima hopping - two state-of-the-art approaches - in trial examples of carbon monoxide adsorption on a hematite substrate, both with and without a defect.
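The abstract names two ingredients, a Gaussian-process surrogate and an exploration/exploitation schedule, without giving details. The following is a minimal sketch of that general idea and not the BASC implementation. Everything specific in it is an assumption made for illustration: a plain squared-exponential kernel stands in for the custom covariance kernel, the acquisition function is a lower confidence bound, the schedule is a linear decay of the exploration weight `kappa`, and `toy_energy` is a hypothetical one-dimensional stand-in for an expensive energy calculation.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    """Squared-exponential covariance between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    v = np.linalg.solve(k_tt, k_qt.T)
    var = 1.0 - np.sum(k_qt * v.T, axis=1)  # diagonal of the posterior covariance
    return mean, np.sqrt(np.clip(var, 0.0, None))

def toy_energy(x):
    """Hypothetical stand-in for an expensive energy evaluation (assumption)."""
    return np.sin(3.0 * x) + 0.5 * x ** 2

def minimize_bo(objective, lower, upper, n_init=3, n_iter=15):
    """Minimize `objective` on [lower, upper] with a scheduled lower confidence bound."""
    grid = np.linspace(lower, upper, 400)     # candidate points
    x = np.linspace(lower, upper, n_init)     # evenly spaced initial design
    y = objective(x)
    for i in range(n_iter):
        offset = y.mean()                     # center the data for the zero-mean GP
        mean, std = gp_posterior(x, y - offset, grid)
        kappa = 3.0 * (1.0 - i / n_iter)      # assumed schedule: explore early, exploit late
        acquisition = mean + offset - kappa * std
        x_next = grid[np.argmin(acquisition)]
        x = np.append(x, x_next)
        y = np.append(y, objective(x_next))
    best = np.argmin(y)
    return x[best], y[best]

best_x, best_y = minimize_bo(toy_energy, -2.0, 2.0)
print(best_x, best_y)
```

A large `kappa` favors candidates where the surrogate is uncertain, and a small `kappa` favors candidates where the predicted energy is low, so decaying it over the run shifts the search from exploration to exploitation. A kernel that encodes known properties of the objective, as the abstract describes, would replace `rbf_kernel` in this sketch.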
Journal Name: 33rd International Conference on Machine Learning, ICML 2016
Sponsoring Org: National Science Foundation
More Like this
  1. We present a new calibration of the peak absolute magnitude of Type Ia supernovae (SNe Ia) based on the surface brightness fluctuations (SBF) method, aimed at measuring the value of the Hubble constant. We build a sample of calibrating anchors consisting of 24 SNe hosted in galaxies that have SBF distance measurements. Applying a hierarchical Bayesian approach, we calibrate the SN Ia peak luminosity and extend the Hubble diagram into the Hubble flow by using a sample of 96 SNe Ia in the redshift range $0.02 < z < 0.075$, which was extracted from the Combined Pantheon Sample. We estimate a value of $H_0 = 70.50 \pm 2.37$ (stat.) $\pm 3.38$ (sys.) km s$^{-1}$ Mpc$^{-1}$ (i.e., 3.4% stat., 4.8% sys.), which is in agreement with the value obtained using the tip of the red giant branch calibration. It is also consistent, within errors, with the value obtained from SNe Ia calibrated with Cepheids or the value inferred from the analysis of the cosmic microwave background. We find that the SNe Ia distance moduli calibrated with SBF are on average larger by 0.07 mag than those calibrated with Cepheids. Our results point to possible differences among SNe in different types of galaxies, which could originate from different local environments and/or progenitor properties of SNe Ia. Sampling different host galaxy types, SBF offers a complementary approach to using Cepheids, which is important in addressing possible systematics. As the SBF method has the ability to reach larger distances than Cepheids, the impending entry of the Vera C. Rubin Observatory and JWST into operation will increase the number of SNe Ia hosted in galaxies where SBF distances can be measured, making SBF measurements attractive for improving the calibration of SNe Ia, as well as in the estimation of $H_0$.

  2. We present a new method of matching observations of Type-I (thermonuclear) X-ray bursts with models, comparing the predictions of a semi-analytic ignition model with X-ray observations of the accretion-powered millisecond pulsar SAX J1808.4–3658 in outburst. We used a Bayesian analysis approach to marginalize over the parameters of interest and determine parameters such as fuel composition, distance/anisotropy factors, neutron star mass, and neutron star radius. Our study includes a treatment of the system inclination effects, inferring that the rotation axis of the system is inclined $\left(69^{+4}_{-2}\right)^\circ$ from the observer's line of sight, assuming a flat disc model. This method can be applied to any accreting source that exhibits Type-I X-ray bursts. We find a hydrogen mass fraction of $0.57^{+0.13}_{-0.14}$ and CNO metallicity of $0.013^{+0.006}_{-0.004}$ for the accreted fuel is required by the model to match the observed burst energies, for a distance to the source of $3.3^{+0.3}_{-0.2}\, \mathrm{kpc}$. We infer a neutron star mass of $1.5^{+0.6}_{-0.3}\, \mathrm{M}_{\odot }$ and radius of $11.8^{+1.3}_{-0.9}\, \mathrm{km}$ for a surface gravity of $1.9^{+0.7}_{-0.4}\times 10^{14}\, \mathrm{cm}\, \mathrm{s}^{-2}$ for SAX J1808.4–3658.
  3. We present the first in a series of dataset and model assessment products for investigating Africa’s lithosphere (ADAMA). This is a comprehensive catalog of short-period interstation surface-wave dispersion measurements and uncertainties. It is derived from processing continuous recordings of all publicly available three-component seismograms, spanning four decades, from ∼1372 stations, across 62 seismic networks deployed in and around the African continent. It includes Love- and Rayleigh-wave dispersion derived from frequency-domain ambient noise cross-correlation functions (NCFs). Phase and group dispersion, as well as their uncertainties, are then obtained with an iterative nonlinear waveform fitting of the NCFs, using a spectral element representation of a path-average a priori Earth model. Our catalog represents the following advances: (1) a large distribution of short period dispersion measurements: ∼114,000 interstation pairs at periods between 5 s and 40 s, (2) inclusion of uncertainties useful for regularization in continent-wide model building, (3) preliminary model assessments for different tectonic domains on the continent, and (4) an exemplary Love-wave phase velocity map obtained by Bayesian inversion revealing detailed features not previously detected. ADAMA will be used to prepare short-period, high-resolution dispersion maps, and for assessment and updates of widely used seismic velocity models of the crust across a diversity of terranes on the continent.
  4. Free energies as a function of a selected set of collective variables are commonly computed in molecular simulation and of significant value in understanding and engineering molecular behavior. These free energy surfaces are most commonly estimated using variants of histogramming techniques, but such approaches obscure two important facets of these functions. First, the empirical observations along the collective variable are defined by an ensemble of discrete observations, and the coarsening of these observations into a histogram bin incurs unnecessary loss of information. Second, the free energy surface is itself almost always a continuous function, and its representation by a histogram introduces inherent approximations due to the discretization. In this study, we relate the observed discrete observations from biased simulations to the inferred underlying continuous probability distribution over the collective variables and derive histogram-free techniques for estimating this free energy surface. We reformulate free energy surface estimation as minimization of a Kullback−Leibler divergence between a continuous trial function and the discrete empirical distribution and show that this is equivalent to likelihood maximization of a trial function given a set of sampled data. We then present a fully Bayesian treatment of this formalism, which enables the incorporation of powerful Bayesian tools such as the inclusion of regularizing priors, uncertainty quantification, and model selection techniques. We demonstrate this new formalism in the analysis of umbrella sampling simulations for the χ torsion of a valine side chain in the L99A mutant of T4 lysozyme with benzene bound in the cavity.
  5. Ensuring that all the teeth surfaces are adequately covered during daily brushing can reduce the risk of several oral diseases. In this paper, we propose the mTeeth model to detect teeth surfaces being brushed with a manual toothbrush in the natural free-living environment using wrist-worn inertial sensors. To unambiguously label sensor data corresponding to different surfaces and capture all transitions that last only milliseconds, we present a lightweight method to detect the micro-event of brushing strokes that cleanly demarcates transitions among brushing surfaces. Using features extracted from brushing strokes, we propose a Bayesian Ensemble method that leverages the natural hierarchy among teeth surfaces and patterns of transition among them. For training and testing, we enrich a publicly-available wrist-worn inertial sensor dataset collected from the natural environment with time-synchronized precise labels of brushing surface timings and moments of transition. We annotate 10,230 instances of brushing on different surfaces from 114 episodes and evaluate the impact of wide between-person and within-person between-episode variability on machine learning model's performance for brushing surface detection.