

Title: BASC: Applying Bayesian optimization to the search for global minima on potential energy surfaces
We present a novel application of Bayesian optimization to the field of surface science: rapidly and accurately searching for the global minimum on potential energy surfaces. Controlling molecule-surface interactions is key for applications ranging from environmental catalysis to gas sensing. We present pragmatic techniques, including exploration/exploitation scheduling and a custom covariance kernel that encodes the properties of our objective function. Our method, the Bayesian Active Site Calculator (BASC), outperforms differential evolution and constrained minima hopping, two state-of-the-art approaches, in trial examples of carbon monoxide adsorption on a hematite substrate, both with and without a defect.
Award ID(s): 1355406
NSF-PAR ID: 10023538
Author(s) / Creator(s): ; ;
Date Published:
Journal Name: 33rd International Conference on Machine Learning, ICML 2016
Format(s): Medium: X
Sponsoring Org: National Science Foundation
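The record above does not include code, so the following Python sketch only illustrates the general recipe the abstract names: a Gaussian process surrogate, an acquisition rule with an annealed exploration/exploitation weight, and a kernel with a periodic component as a crude stand-in for a custom covariance over adsorption coordinates. It is not the BASC implementation, and the `energy` function is a hypothetical placeholder for a DFT evaluation.

```python
# Illustrative sketch only: generic Bayesian optimization with a GP surrogate,
# an annealed exploration/exploitation schedule, and a kernel with a periodic
# term standing in for a custom covariance over adsorption coordinates.
# Not the BASC code; `energy` is a placeholder for a DFT single-point call.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared

def energy(x):
    # Hypothetical stand-in objective; a real run would evaluate the
    # adsorption energy of configuration x with an electronic-structure code.
    return np.sin(3 * x[0]) + 0.5 * np.cos(2 * x[1]) + 0.1 * x[0] ** 2

rng = np.random.default_rng(0)
bounds = np.array([[0.0, 4.0], [0.0, 2 * np.pi]])   # e.g. height, rotation angle

# Product of an RBF and a periodic kernel: a crude stand-in for a kernel that
# respects the periodicity of angular coordinates.
kernel = RBF(length_scale=[1.0, 1.0]) * ExpSineSquared(length_scale=1.0, periodicity=2 * np.pi)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)

X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, 2))   # initial random samples
y = np.array([energy(x) for x in X])

for i in range(30):
    gp.fit(X, y)
    # Candidate pool; a real implementation would optimize the acquisition directly.
    cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(2000, 2))
    mu, sigma = gp.predict(cand, return_std=True)
    kappa = 3.0 * (1.0 - i / 30)        # anneal from exploration to exploitation
    acq = -(mu - kappa * sigma)         # lower confidence bound (energy is minimized)
    x_next = cand[np.argmax(acq)]
    X = np.vstack([X, x_next])
    y = np.append(y, energy(x_next))

print("best configuration:", X[np.argmin(y)], "energy:", y.min())
```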
More Like this
  1. Standard approaches for functional principal components analysis rely on an eigendecomposition of a smoothed covariance surface in order to extract the orthonormal eigenfunctions representing the major modes of variation in a set of functional data. This approach can be computationally intensive, especially in the presence of large datasets with irregular observations. In this article, we develop a variational Bayesian approach, which aims to determine the Karhunen-Loève decomposition directly without smoothing and estimating a covariance surface. More specifically, we incorporate the notion of variational message passing over a factor graph because it removes the need for rederiving approximate posterior density functions if there is a change in the model. Instead, model changes are handled by changing specific computational units, known as fragments, within the factor graph; we demonstrate this with an extension to multilevel functional data. Indeed, this is the first article to address a functional data model via variational message passing. Our approach introduces three new fragments that are necessary for Bayesian functional principal components analysis. We present the computational details, a set of simulations for assessing the accuracy and speed of the variational message passing algorithm, and an application to United States temperature data.
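For orientation, here is a minimal Python sketch (with synthetic data) of the standard eigendecomposition route that this abstract contrasts with: form an empirical covariance surface on a common grid and eigendecompose it to recover Karhunen-Loève eigenfunctions and scores. It is not the variational message passing algorithm described above.

```python
# Minimal sketch of the standard FPCA route described above: eigendecompose an
# empirical covariance surface to recover Karhunen-Loeve eigenfunctions/scores.
# This is the baseline the abstract contrasts with, not the variational
# message passing algorithm; the data below are synthetic.
import numpy as np

rng = np.random.default_rng(1)
n_curves, n_grid = 200, 101
t = np.linspace(0, 1, n_grid)

# Synthetic functional data: two smooth modes of variation plus noise.
phi1, phi2 = np.sqrt(2) * np.sin(2 * np.pi * t), np.sqrt(2) * np.cos(2 * np.pi * t)
scores = rng.normal(size=(n_curves, 2)) * np.array([2.0, 1.0])
Y = scores @ np.vstack([phi1, phi2]) + 0.1 * rng.normal(size=(n_curves, n_grid))

Y_centered = Y - Y.mean(axis=0)
cov = Y_centered.T @ Y_centered / (n_curves - 1)   # empirical covariance surface

# Eigendecomposition; rescale so eigenfunctions are orthonormal in L2([0, 1]).
dt = t[1] - t[0]
evals, evecs = np.linalg.eigh(cov * dt)
order = np.argsort(evals)[::-1]
eigenfunctions = evecs[:, order[:2]].T / np.sqrt(dt)
pc_scores = Y_centered @ eigenfunctions.T * dt     # estimated KL scores per curve

print("leading eigenvalues:", evals[order[:2]])
```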
  2. Abstract

    Chemically homogeneous evolution (CHE) is a promising channel for forming massive binary black holes. The enigmatic, massive Wolf–Rayet binary HD 5980 A&B has been proposed to have formed through this channel. We investigate this claim by comparing its observed parameters with CHE models. Using MESA, we simulate grids of close massive binaries, then use a Bayesian approach to compare them with the stars’ observed orbital period, masses, luminosities, and hydrogen surface abundances. The most probable models, given the observational data, have initial periods of ∼3 days, widening to the present-day ∼20 day orbit as a result of mass loss; correspondingly, they have very high initial stellar masses (≳150 M⊙). We explore variations in stellar-wind mass loss and internal mixing efficiency, and find that models assuming enhanced mass loss are greatly favored to explain HD 5980, while enhanced mixing is only slightly favored over our fiducial assumptions. Our most probable models slightly underpredict the hydrogen surface abundances. Regardless of its prior history, this system is a likely binary black hole progenitor. We model its further evolution under our fiducial and enhanced wind assumptions, finding that both stars produce black holes with masses of ∼19–37 M⊙. The projected final orbit is too wide to merge within a Hubble time through gravitational waves alone. However, the system is thought to be part of a 2+2 hierarchical multiple. We speculate that secular effects with the (possible) third and fourth companions may drive the system to promptly become a gravitational-wave source.

     
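Since the abstract describes comparing a grid of precomputed models against observed quantities, the Python sketch below shows that pattern in its simplest form, assuming independent Gaussian observational uncertainties. It is a generic illustration, not the authors' MESA pipeline; all model and observed values are placeholders, not HD 5980 data.

```python
# Generic sketch of grid-based Bayesian model comparison of the kind described
# above: score each precomputed binary-evolution model against observed
# quantities assuming independent Gaussian uncertainties. All numbers are
# placeholders, not HD 5980 values or actual MESA output.
import numpy as np

# Hypothetical model grid: each row is (orbital period [d], primary mass [Msun],
# log10 luminosity [Lsun]) predicted by one evolutionary track at the present day.
models = np.array([
    [18.0, 55.0, 6.1],
    [20.5, 60.0, 6.2],
    [24.0, 48.0, 6.0],
])
prior = np.full(len(models), 1.0 / len(models))   # flat prior over the grid

observed = np.array([19.3, 61.0, 6.25])           # placeholder "observations"
sigma = np.array([0.1, 10.0, 0.1])                # placeholder 1-sigma errors

# Gaussian log-likelihood for each model, then normalized posterior weights.
log_like = -0.5 * np.sum(((models - observed) / sigma) ** 2, axis=1)
log_post = np.log(prior) + log_like
post = np.exp(log_post - log_post.max())
post /= post.sum()

print("posterior probability of each model:", post)
```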
  3. Abstract

    Surface soil moisture (SSM) has been identified as a key climate variable governing hydrologic and atmospheric processes across multiple spatial scales at local, regional, and global levels. The global burgeoning of SSM datasets in the past decade holds significant potential for improving our understanding of multiscale SSM dynamics. The primary issues that hinder the fusion of SSM data from disparate instruments are (1) the different spatial resolutions of the instruments, (2) inherent spatial variability in SSM caused by atmospheric and land surface controls, and (3) measurement errors caused by imperfect instrument retrievals. We present a data fusion scheme that takes all three factors into account using a Bayesian spatial hierarchical model (SHM), combining a geostatistical approach with a hierarchical model. The applicability of the fusion scheme is demonstrated by fusing point, airborne, and satellite data for a watershed exhibiting high spatial variability in Manitoba, Canada. We demonstrate that the proposed data fusion scheme is adept at assimilating and predicting the SSM distribution across all three scales while accounting for potential measurement errors caused by imperfect retrievals. Further validation of the algorithm in different hydroclimates, over different degrees of surface heterogeneity, and with other data platforms is required for wider applicability.

     
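As a deliberately simplified illustration of why per-instrument measurement-error variances matter in such a fusion, the Python sketch below combines three noisy SSM estimates of one grid cell by Gaussian precision weighting. The paper's method is a full Bayesian spatial hierarchical model that also handles differing spatial supports; the numbers here are placeholders.

```python
# Simplified illustration of one ingredient of multi-instrument fusion:
# combining noisy SSM estimates of the same location, each with its own
# measurement-error variance, via precision (inverse-variance) weighting under
# a Gaussian model. The actual scheme above is a full Bayesian spatial
# hierarchical model; these numbers are placeholders.
import numpy as np

# Soil moisture estimates (m^3/m^3) for one grid cell from three platforms.
obs = np.array([0.24, 0.27, 0.31])          # point probe, airborne, satellite
err_var = np.array([0.0004, 0.0025, 0.01])  # instrument error variances

prior_mean, prior_var = 0.25, 0.01          # weak prior on the latent SSM value

# Conjugate Gaussian update: posterior precision is the sum of precisions.
post_prec = 1.0 / prior_var + np.sum(1.0 / err_var)
post_var = 1.0 / post_prec
post_mean = post_var * (prior_mean / prior_var + np.sum(obs / err_var))

print(f"fused SSM estimate: {post_mean:.3f} +/- {np.sqrt(post_var):.3f}")
```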
  4. Abstract

    Extremely large telescopes (ELTs) present an unparalleled opportunity to study the magnetism, atmospheric dynamics, and chemistry of very-low-mass (VLM) stars, brown dwarfs, and exoplanets. Instruments such as the Giant Magellan Telescope–Consortium Large Earth Finder (GMT/GCLEF), the Thirty Meter Telescope’s Multi-Object Diffraction-limited High-Resolution Infrared Spectrograph (TMT/MODHIS), and the European Southern Observatory’s Mid-Infrared ELT Imager and Spectrograph (ELT/METIS) provide the spectral resolution and signal-to-noise ratio necessary to Doppler image ultracool targets’ surfaces based on temporal spectral variations due to surface inhomogeneities. Using our publicly available code, Imber, developed and validated in Plummer & Wang, we evaluate these instruments’ abilities to discern magnetic starspots and cloud systems on a VLM star (TRAPPIST-1), two L/T transition ultracool dwarfs (VHS J1256−1257 b and SIMP J0136+0933), and three exoplanets (Beta Pic b and HR 8799 d and e). We find that TMT/MODHIS and ELT/METIS are suitable for Doppler imaging the ultracool dwarfs and Beta Pic b over a single rotation. Uncertainties for longitude and radius are typically ≲10°, and latitude uncertainties range from ∼10° to 30°. TRAPPIST-1's edge-on inclination and low v sin i present a challenge for all three instruments, while GMT/GCLEF and the HR 8799 planets may require observations over multiple rotations. We compare the spectroscopic technique, photometry-only inference, and the combination of the two. We find that combining spectroscopic and photometric observations can lead to improved Bayesian inference of surface inhomogeneities and offers insight into whether ultracool atmospheres are dominated by spotted or banded features.

     
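To make the underlying signal concrete, here is a toy Python forward model of how a single dark spot rotating across the visible hemisphere modulates a light curve, which is the kind of variation the photometric inference described above exploits. It does not use or reproduce the Imber API; the geometry is deliberately simplified (equator-on viewing, no limb darkening) and all values are invented.

```python
# Toy forward model of the photometric signal behind such inferences: a single
# dark spot rotating across the visible hemisphere modulates the light curve.
# Not the Imber API; geometry is deliberately simplified (spot on the equator,
# star viewed edge-on, no limb darkening), and all values are made up.
import numpy as np

def spot_lightcurve(phase, spot_lon_deg, spot_radius_deg, contrast=0.7):
    """Relative flux vs. rotational phase for one circular spot (crude model)."""
    lon = np.deg2rad(spot_lon_deg) + 2 * np.pi * phase    # spot longitude over time
    mu = np.cos(lon)                                      # foreshortening factor
    visible = np.clip(mu, 0.0, None)                      # spot hidden on far side
    spot_area = np.pi * np.deg2rad(spot_radius_deg) ** 2  # small-spot approximation
    # Fractional flux deficit: projected spot area times (1 - contrast).
    return 1.0 - (spot_area / np.pi) * (1.0 - contrast) * visible

phase = np.linspace(0.0, 2.0, 200)                        # two rotations
flux = spot_lightcurve(phase, spot_lon_deg=40.0, spot_radius_deg=15.0)
print("peak-to-trough depth: %.4f" % (flux.max() - flux.min()))
```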
  5. Abstract

    Near‐term ecological forecasts provide resource managers with advance notice of changes in ecosystem services, such as fisheries stocks, timber yields, or water quality. Importantly, ecological forecasts can identify where there is uncertainty in the forecasting system, which is necessary to improve forecast skill and guide interpretation of forecast results. Uncertainty partitioning identifies the relative contributions to total forecast variance introduced by different sources, including specification of the model structure, errors in driver data, and estimation of current states (initial conditions). Uncertainty partitioning could be particularly useful in improving forecasts of highly variable cyanobacterial densities, which are difficult to predict and present a persistent challenge for lake managers. As cyanobacteria can produce toxic and unsightly surface scums, advance warning when cyanobacterial densities are increasing could help managers mitigate water quality issues. Here, we fit 13 Bayesian state‐space models to evaluate different hypotheses about cyanobacterial densities in a low nutrient lake that experiences sporadic surface scums of the toxin‐producing cyanobacterium, Gloeotrichia echinulata. We used data from several summers of weekly cyanobacteria samples to identify dominant sources of uncertainty for near‐term (1‐ to 4‐week) forecasts of G. echinulata densities. Water temperature was an important predictor of cyanobacterial densities during model fitting and at the 4‐week forecast horizon. However, no physical covariates improved model performance over a simple model including the previous week's densities in 1‐week‐ahead forecasts. Even the best-fit models exhibited large variance in forecasted cyanobacterial densities and did not capture rare peak occurrences, indicating that variables that are significant when fitting models to historical data are not always effective for forecasting. Uncertainty partitioning revealed that model process specification and initial conditions dominated forecast uncertainty. These findings indicate that long‐term studies of different cyanobacterial life stages and movement in the water column, as well as measurements of drivers relevant to different life stages, could improve model process representation of cyanobacteria abundance. In addition, improved observation protocols could better define initial conditions and reduce spatial misalignment of environmental data and cyanobacteria observations. Our results emphasize the importance of ecological forecasting principles and uncertainty partitioning to refine and understand predictive capacity across ecosystems.

     
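The abstract's central idea, partitioning total forecast variance among sources such as initial conditions, process error, and parameters, can be illustrated with a minimal Python sketch: propagate an ensemble through a simple log-scale AR(1) state-space model and compare the forecast variance obtained when each uncertainty source is switched on by itself. This is a generic illustration with placeholder parameters, not any of the 13 fitted models.

```python
# Minimal sketch of uncertainty partitioning for a near-term forecast: propagate
# an ensemble through a simple log-scale AR(1) state-space model and measure how
# the 4-week-ahead forecast variance changes as each uncertainty source is
# switched on. Generic illustration only; all parameter values are placeholders.
import numpy as np

rng = np.random.default_rng(7)
n_ens, horizon = 5000, 4                 # ensemble size, weeks ahead

def forecast_var(ic_sd, proc_sd, param_sd):
    """Variance of the horizon-week-ahead log density under the chosen sources."""
    x = np.log(50.0) + rng.normal(0, ic_sd, n_ens)     # initial condition spread
    phi = 0.9 + rng.normal(0, param_sd, n_ens)         # persistence parameter spread
    for _ in range(horizon):
        x = phi * x + rng.normal(0, proc_sd, n_ens)    # process error each week
    return x.var()

total = forecast_var(ic_sd=0.3, proc_sd=0.4, param_sd=0.05)
# One-at-a-time partitioning: variance with only a single source active.
# (Shares need not sum to exactly 100% because the sources interact.)
parts = {
    "initial conditions": forecast_var(0.3, 0.0, 0.0),
    "process error": forecast_var(0.0, 0.4, 0.0),
    "parameters": forecast_var(0.0, 0.0, 0.05),
}
for name, v in parts.items():
    print(f"{name}: {100 * v / total:.1f}% of total forecast variance")
```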