Title: Calibrating Cosmological Simulations with Implicit Likelihood Inference Using Galaxy Growth Observables

In a novel approach employing implicit likelihood inference (ILI), also known as likelihood-free inference, we calibrate the parameters of cosmological hydrodynamic simulations against observations, a task that was previously infeasible because of the high computational cost of these simulations. For computational efficiency, we train neural networks as emulators on ∼1000 cosmological simulations from the CAMELS project to estimate simulated observables, taking as input the cosmological and astrophysical parameters, and use these emulators as surrogates for the cosmological simulations. Using the cosmic star formation rate density (SFRD) and, separately, the stellar mass functions (SMFs) at different redshifts, we perform ILI on selected cosmological and astrophysical parameters (Ωm, σ8, stellar wind feedback, and kinetic black hole feedback) and obtain full six-dimensional posterior distributions. In the performance test, the ILI from the emulated SFRD (SMFs) can recover the target observables with a relative error of 0.17% (0.4%). We find that degeneracies exist between the parameters inferred from the emulated SFRD, which we confirm with new full cosmological simulations. We also find that the SMFs can break the degeneracy in the SFRD, which indicates that the SMFs provide complementary constraints on the parameters. Further, we find that a parameter combination inferred from an observationally derived SFRD reproduces the target observed SFRD very well, whereas, in the case of the SMFs, the inferred and observed SMFs show significant discrepancies that indicate potential limitations of the current galaxy formation modeling and calibration framework, and/or systematic differences and inconsistencies between observations of the SMFs.
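The emulate-then-infer workflow described in the abstract can be illustrated with a deliberately small toy. The sketch below is not the authors' pipeline: it assumes a hypothetical two-parameter `toy_simulator` in place of the CAMELS runs, swaps in a quadratic least-squares surrogate for the neural-network emulator, and uses rejection sampling (approximate Bayesian computation, one simple form of likelihood-free inference) in place of the ILI method used in the paper. All names, parameter ranges, and the 1% acceptance quantile are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
z = np.linspace(0.0, 6.0, 20)  # redshift grid for the toy observable


def toy_simulator(theta):
    """Hypothetical stand-in for an expensive simulation: parameters -> toy SFRD-like curve."""
    a, b = theta[..., 0:1], theta[..., 1:2]
    return a * np.exp(-((z - 2.0) ** 2) / 8.0) - b * z / 6.0


def features(theta):
    """Quadratic polynomial features of the two parameters (assumed surrogate form)."""
    a, b = theta[:, 0], theta[:, 1]
    return np.stack([np.ones_like(a), a, b, a * b, a**2, b**2], axis=1)


# 1) Fit the surrogate on 1000 "simulations" drawn from a uniform prior.
low, high = np.array([0.5, 0.5]), np.array([2.0, 2.0])
theta_train = rng.uniform(low, high, size=(1000, 2))
obs_train = toy_simulator(theta_train)
weights, *_ = np.linalg.lstsq(features(theta_train), obs_train, rcond=None)


def emulator(theta):
    """Cheap surrogate used in place of the simulator during inference."""
    return features(theta) @ weights


# 2) Rejection sampling: draw from the prior, keep the draws whose emulated
#    observable lies closest to the target (here the closest 1%).
theta_true = np.array([1.2, 0.8])
target = toy_simulator(theta_true)
theta_prior = rng.uniform(low, high, size=(50_000, 2))
dist = np.linalg.norm(emulator(theta_prior) - target, axis=1)
posterior = theta_prior[dist <= np.quantile(dist, 0.01)]

# 3) Summaries: recovered parameters, relative error on the observable,
#    and the correlation between the two parameters in the accepted sample.
best = posterior.mean(axis=0)
recovered = emulator(best[None, :])[0]
rel_err = np.linalg.norm(recovered - target) / np.linalg.norm(target)
corr = np.corrcoef(posterior.T)[0, 1]

print("posterior mean:", best)
print("relative error on observable:", rel_err)
print("parameter correlation:", corr)
```

In this toy, a nonzero `corr` plays the role of a parameter degeneracy: raising both parameters together changes the curve less than raising either alone, so a single observable constrains a combination of them better than each one separately. A second, differently shaped observable would be the toy analogue of the complementary constraint the abstract attributes to the SMFs.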

Award ID(s):
2108678 2108470 2108944
Publisher / Repository:
DOI PREFIX: 10.3847
Journal Name:
The Astrophysical Journal
Medium: X Size: Article No. 67
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    We present a new self-consistent semianalytic model of the first stars and galaxies to explore the high-redshift (z ≥ 15) Population III (PopIII) and metal-enriched star formation histories. Our model includes the detailed merger history of dark matter halos generated with Monte Carlo merger trees. We calibrate the minimum halo mass for PopIII star formation from recent hydrodynamical cosmological simulations that simultaneously include the baryon–dark matter streaming velocity, Lyman–Werner (LW) feedback, and molecular hydrogen self-shielding. We find an overall increase in the resulting star formation rate density (SFRD) compared to calibrations based on previous simulations (e.g., the PopIII SFRD is over an order of magnitude higher at z = 35−15). We evaluate the effect of the halo-to-halo scatter in this critical mass and find that it increases the PopIII stellar mass density by a factor ∼1.5 at z ≥ 15. Additionally, we assess the impact of various semianalytic/analytic prescriptions for halo assembly and star formation previously adopted in the literature. For example, we find that models assuming smooth halo growth computed via abundance matching predict SFRDs similar to the merger tree model for our fiducial model parameters, but that they may underestimate the PopIII SFRD in cases of strong LW feedback. Finally, we simulate subvolumes of the Universe with our model both to quantify the reduction in total star formation in numerical simulations due to a lack of density fluctuations on spatial scales larger than the simulation box, and to determine spatial fluctuations in SFRD due to the diversity in halo abundances and merger histories.

  2. Abstract

    Galaxies can be characterized by many internal properties such as stellar mass, gas metallicity, and star formation rate. We quantify the amount of cosmological and astrophysical information that the internal properties of individual galaxies and their host dark matter halos contain. We train neural networks using hundreds of thousands of galaxies from 2000 state-of-the-art hydrodynamic simulations with different cosmologies and astrophysical models of the CAMELS project to perform likelihood-free inference on the value of the cosmological and astrophysical parameters. We find that knowing the internal properties of a single galaxy allows our models to infer the value of Ωm, at fixed Ωb, with a ∼10% precision, while no constraint can be placed on σ8. Our results hold for any type of galaxy, central or satellite, massive or dwarf, at all considered redshifts, z ≤ 3, and they incorporate uncertainties in astrophysics as modeled in CAMELS. However, our models are not robust to changes in subgrid physics due to the large intrinsic differences the two considered models imprint on galaxy properties. We find that the stellar mass, stellar metallicity, and maximum circular velocity are among the most important galaxy properties to determine the value of Ωm. We believe that our results can be explained by considering that changes in the value of Ωm, or potentially Ωb/Ωm, affect the dark matter content of galaxies, which leaves a signature in galaxy properties distinct from the one induced by galactic processes. Our results suggest that the low-dimensional manifold hosting galaxy properties provides a tight direct link between cosmology and astrophysics.

  3. Abstract

    Extracting information from the total matter power spectrum with the precision needed for upcoming cosmological surveys requires unraveling the complex effects of galaxy formation processes on the distribution of matter. We investigate the impact of baryonic physics on matter clustering at z = 0 using a library of power spectra from the Cosmology and Astrophysics with MachinE Learning Simulations project, containing thousands of $(25\, h^{-1}\, {\rm Mpc})^3$ volume realizations with varying cosmology, initial random field, stellar and active galactic nucleus (AGN) feedback strength and subgrid model implementation methods. We show that baryonic physics affects matter clustering on scales $k \gtrsim 0.4\, h\, \mathrm{Mpc}^{-1}$ and the magnitude of this effect is dependent on the details of the galaxy formation implementation and variations of cosmological and astrophysical parameters. Increasing AGN feedback strength decreases halo baryon fractions and yields stronger suppression of power relative to N-body simulations, while stronger stellar feedback often results in weaker effects by suppressing black hole growth and therefore the impact of AGN feedback. We find a broad correlation between mean baryon fraction of massive haloes ($M_{\rm 200c} > 10^{13.5}\, {\rm M}_\odot$) and suppression of matter clustering but with significant scatter compared to previous work owing to wider exploration of feedback parameters and cosmic variance effects. We show that a random forest regressor trained on the baryon content and abundance of haloes across the full mass range $10^{10} \leq M_{\rm halo}/{\rm M}_\odot < 10^{15}$ can predict the effect of galaxy formation on the matter power spectrum on scales k = 1.0–20.0 $h\, \mathrm{Mpc}^{-1}$.


  4. Abstract

    We explore the assumption, widely used in many astrophysical calculations, that the stellar initial mass function (IMF) is universal across all galaxies. By considering both a canonical broken-power-law IMF and a non-universal IMF, we are able to compare the effect of different IMFs on multiple observables and derived quantities in astrophysics. Specifically, we consider a non-universal IMF that varies as a function of the local star formation rate, and explore the effects on the star formation rate density (SFRD), the extragalactic background light, the supernova (both core-collapse and thermonuclear) rates, and the diffuse supernova neutrino background. Our most interesting result is that our adopted varying IMF leads to much greater uncertainty on the SFRD at $z \approx 2-4$ than is usually assumed. Indeed, we find an SFRD (inferred using observed galaxy luminosity distributions) that is a factor of $\gtrsim 3$ lower than canonical results obtained using a universal IMF. Secondly, the non-universal IMF we explore implies a reduction in the supernova core-collapse rate of a factor of $\sim 2$, compared against a universal IMF. The other potential tracers are only slightly affected by changes to the properties of the IMF. We find that currently available data do not provide a clear preference for universal or non-universal IMF. However, improvements to measurements of the star formation rate and core-collapse supernova rate at redshifts $z \gtrsim 2$ may offer the best prospects for discernment.

  5. Abstract

    We present CAMELS-ASTRID, the third suite of hydrodynamical simulations in the Cosmology and Astrophysics with MachinE Learning (CAMELS) project, along with new simulation sets that extend the model parameter space based on the previous frameworks of CAMELS-TNG and CAMELS-SIMBA, to provide broader training sets and testing grounds for machine-learning algorithms designed for cosmological studies. CAMELS-ASTRID employs the galaxy formation model following the ASTRID simulation and contains 2124 hydrodynamic simulation runs that vary three cosmological parameters (Ωm, σ8, Ωb) and four parameters controlling stellar and active galactic nucleus (AGN) feedback. Compared to the existing TNG and SIMBA simulation suites in CAMELS, the fiducial model of ASTRID features the mildest AGN feedback and predicts the least baryonic effect on the matter power spectrum. The training set of ASTRID covers a broader variation in the galaxy populations and the baryonic impact on the matter power spectrum compared to its TNG and SIMBA counterparts, which can make machine-learning models trained on the ASTRID suite exhibit better extrapolation performance when tested on other hydrodynamic simulation sets. We also introduce extension simulation sets in CAMELS that widely explore 28 parameters in the TNG and SIMBA models, demonstrating the enormity of the overall galaxy formation model parameter space and the complex nonlinear interplay between cosmology and astrophysical processes. With the new simulation suites, we show that building robust machine-learning models favors training and testing on the largest possible diversity of galaxy formation models. We also demonstrate that it is possible to train accurate neural networks to infer cosmological parameters using the high-dimensional TNG-SB28 simulation set.
