
Title: Revealing the Galaxy–Halo Connection through Machine Learning

Understanding the connections between galaxy stellar mass, star formation rate, and dark matter halo mass represents a key goal of the theory of galaxy formation. Cosmological simulations that include hydrodynamics, physical treatments of star formation, feedback from supernovae, and the radiative transfer of ionizing photons can capture the processes relevant for establishing these connections. The complexity of these physics can prove difficult to disentangle and can obfuscate how mass-dependent trends in the galaxy population originate. Here, we train a machine-learning method called Explainable Boosting Machines (EBMs) to infer how the stellar mass and star formation rate of nearly 6 million galaxies simulated by the Cosmic Reionization on Computers project depend on the physical properties of halo mass, the peak circular velocity of the galaxy during its formation history $v_{\rm peak}$, cosmic environment, and redshift. The resulting EBM models reveal the relative importance of these properties in setting galaxy stellar mass and star formation rate, with $v_{\rm peak}$ providing the most dominant contribution. Environmental properties provide substantial improvements for modeling the stellar mass and star formation rate in only ≲10% of the simulated galaxies. We also provide alternative formulations of EBM models that enable low-resolution simulations, which cannot track the interior structure of dark matter halos, to predict the stellar mass and star formation rate of galaxies computed by high-resolution simulations with detailed baryonic physics.
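As a rough illustration of the kind of model the abstract describes, the sketch below fits a toy additive model by cycling through the features one at a time and boosting a binned shape function for each, which is the basic idea behind EBMs. It is not the authors' pipeline and does not use any EBM library: the function name `fit_additive_model`, the two synthetic features (stand-ins for a dominant property such as $v_{\rm peak}$ and a weak one such as environment), the coefficients, and the noise level are all assumptions made for this example.

```python
import random


def fit_additive_model(X, y, n_bins=8, n_rounds=100, learning_rate=0.1):
    """Toy EBM-like fit: cyclic boosting of one binned shape function per feature."""
    n_samples, n_features = len(X), len(X[0])

    # Assign every sample to an equal-width bin, separately for each feature.
    bin_index = []
    for j in range(n_features):
        column = [row[j] for row in X]
        lo, hi = min(column), max(column)
        width = (hi - lo) / n_bins or 1.0
        bin_index.append([min(int((v - lo) / width), n_bins - 1) for v in column])

    intercept = sum(y) / n_samples
    shapes = [[0.0] * n_bins for _ in range(n_features)]
    pred = [intercept] * n_samples

    for _ in range(n_rounds):
        for j in range(n_features):  # visit one feature at a time
            sums = [0.0] * n_bins
            counts = [0] * n_bins
            for i in range(n_samples):
                b = bin_index[j][i]
                sums[b] += y[i] - pred[i]  # accumulate current residuals per bin
                counts[b] += 1
            # Small step toward the mean residual in each bin.
            steps = [learning_rate * s / c if c else 0.0 for s, c in zip(sums, counts)]
            for b in range(n_bins):
                shapes[j][b] += steps[b]
            for i in range(n_samples):
                pred[i] += steps[bin_index[j][i]]

    # Importance of a feature: mean absolute contribution over the sample.
    importance = [
        sum(abs(shapes[j][b]) for b in bin_index[j]) / n_samples
        for j in range(n_features)
    ]
    return intercept, shapes, importance, pred


# Synthetic data (hypothetical): feature 0 dominates, feature 1 contributes weakly.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(1000)]
y = [2.0 * a + 0.2 * b + random.gauss(0.0, 0.1) for a, b in X]

intercept, shapes, importance, pred = fit_additive_model(X, y)
rmse = (sum((t - p) ** 2 for t, p in zip(y, pred)) / len(y)) ** 0.5
print("importance per feature:", importance)
print("training RMSE:", rmse)
```

Because each feature has its own shape function, its contribution can be inspected and ranked directly, which is what makes this family of models useful for asking which property matters most.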

Publisher / Repository:
DOI PREFIX: 10.3847
Journal Name:
The Astrophysical Journal
Medium: X Size: Article No. 122
Sponsoring Org:
National Science Foundation
More Like this

  1. We present Diffstar, a smooth parametric model for the in situ star formation history (SFH) of galaxies. The Diffstar model is distinct from traditional SFH models because it is parametrized directly in terms of basic features of galaxy formation physics. Diffstar includes ingredients for: the halo mass assembly history; the accretion of gas into the dark matter halo; the fraction of gas that is eventually transformed into stars, $\epsilon_{\rm ms}$; the time-scale over which this transformation occurs, $\tau_{\rm cons}$; and the possibility that some galaxies will experience a quenching event at time $t_{\rm q}$, and may subsequently experience rejuvenated star formation. We show that our model is sufficiently flexible to describe the average stellar mass histories of galaxies in both the IllustrisTNG (TNG) and UniverseMachine (UM) simulations with an accuracy of ∼0.1 dex across most of cosmic time. We use Diffstar to compare TNG to UM in common physical terms, finding that: (i) star formation in UM is less efficient and burstier relative to TNG; (ii) UM galaxies have longer gas consumption time-scales, relative to TNG; (iii) rejuvenated star formation is ubiquitous in UM, whereas quenched TNG galaxies rarely experience sustained rejuvenation; and (iv) in both simulations, the distributions of $\epsilon_{\rm ms}$, $\tau_{\rm cons}$, and $t_{\rm q}$ share a common characteristic dependence upon halo mass, and present significant correlations with halo assembly history. We conclude with a discussion of how Diffstar can be used in future applications to fit the SEDs of individual observed galaxies, as well as in forward-modelling applications that populate cosmological simulations with synthetic galaxies.

  2. We introduce a suite of cosmological volume simulations to study the evolution of galaxies as part of the Feedback in Realistic Environments project. FIREbox, the principal simulation of the present suite, provides a representative sample of galaxies (∼1000 galaxies with $M_{\rm star} > 10^8\, M_\odot$ at $z = 0$) at a resolution ($\Delta x \sim 20\, {\rm pc}$, $m_{\rm b} \sim 6\times 10^4\, M_\odot$) comparable to state-of-the-art galaxy zoom-in simulations. FIREbox captures the multiphase nature of the interstellar medium in a fully cosmological setting ($L = 22.1$ Mpc) thanks to its exceptionally high dynamic range ($\gtrsim 10^6$) and the inclusion of multichannel stellar feedback. Here, we focus on validating the simulation predictions by comparing to observational data. We find that star formation rates, gas masses, and metallicities of simulated galaxies with $M_{\rm star} < 10^{10.5-11}\, M_\odot$ broadly agree with observations. These galaxy scaling relations extend to low masses ($M_{\rm star}\sim 10^7\, M_\odot$) and follow a (broken) power-law relationship. Also reproduced are the evolution of the cosmic HI density and the HI column density distribution at $z \sim 0$–5. At low $z$, FIREbox predicts a peak in the stellar-mass–halo-mass relation but also a higher abundance of massive galaxies and a higher cosmic star formation rate density than observed, showing that stellar feedback alone is insufficient to reproduce the properties of massive galaxies at late times. Given its high resolution and sample size, FIREbox offers a baseline prediction of galaxy formation theory in a ΛCDM Universe while also highlighting modelling challenges to be addressed in next-generation galaxy simulations.

  3. The physics of cosmic-ray (CR) transport remains a key uncertainty in assessing whether CRs can produce galaxy-scale outflows consistent with observations. In this paper, we elucidate the physics of CR-driven galactic winds for CR transport dominated by diffusion. A companion paper considers CR streaming. We use analytic estimates validated by time-dependent spherically symmetric simulations to derive expressions for the mass-loss rate, momentum flux, and speed of CR-driven galactic winds, suitable for cosmological-scale or semi-analytic models of galaxy formation. For CR diffusion coefficients $\kappa \gtrsim r_0 c_i$, where $r_0$ is the base radius of the wind and $c_i$ is the isothermal gas sound speed, the asymptotic wind energy flux is comparable to that supplied to CRs, and the outflow rapidly accelerates to supersonic speeds. By contrast, for $\kappa \lesssim r_0 c_i$, CR-driven winds accelerate more slowly and lose most of their energy to gravity, a CR analogue of photon-tired stellar winds. Given CR diffusion coefficients estimated using Fermi gamma-ray observations of pion decay, we predict mass-loss rates in CR-driven galactic winds of the order of the star formation rate for dwarf and disc galaxies. The dwarf galaxy mass-loss rates are small compared to the mass-loadings needed to reconcile the stellar and dark matter halo mass functions. For nuclear starbursts (e.g. M82, Arp 220), CR diffusion and pion losses suppress the CR pressure in the galaxy and the strength of CR-driven winds. We discuss the implications of our results for interpreting observations of galactic winds and for the role of CRs in galaxy formation.

  4. We describe a public data release of the FIRE-2 cosmological zoom-in simulations of galaxy formation (available at ) from the Feedback In Realistic Environments (FIRE) project. FIRE-2 simulations achieve parsec-scale resolution to explicitly model the multiphase interstellar medium while implementing direct models for stellar evolution and feedback, including stellar winds, core-collapse and Type Ia supernovae, radiation pressure, photoionization, and photoelectric heating. We release complete snapshots from three suites of simulations. The first comprises 20 simulations that zoom in on 14 Milky Way (MW)–mass galaxies, five SMC/LMC-mass galaxies, and four lower-mass galaxies including one ultrafaint; we release 39 snapshots across z = 0–10. The second comprises four massive galaxies, with 19 snapshots across z = 1–10. Finally, a high-redshift suite comprises 22 simulations, with 11 snapshots across z = 5–10. Each simulation also includes dozens of resolved lower-mass (satellite) galaxies in its zoom-in region. Snapshots include all stored properties for all dark matter, gas, and star particles, including 11 elemental abundances for stars and gas, and formation times (ages) of star particles. We also release accompanying (sub)halo catalogs, which include galaxy properties and member star particles. For the simulations to z = 0, including all MW-mass galaxies, we release the formation coordinates and an “ex situ” flag for all star particles, pointers to track particles across snapshots, catalogs of stellar streams, and multipole basis expansions for the halo mass distributions. We describe publicly available Python packages for reading and analyzing these simulations.
  5. We predict the stellar mass–halo mass (SMHM) relationship for dwarf galaxies, using simulated galaxies with peak halo masses of $M_{\rm peak} = 10^{11}\, M_\odot$ down into the ultra-faint dwarf range to $M_{\rm peak} = 10^{7}\, M_\odot$. Our simulated dwarfs have stellar masses of $M_{\rm star} = 790\, M_\odot$ to $8.2 \times 10^{8}\, M_\odot$, with corresponding V-band magnitudes from −2 to −18.5. For $M_{\rm peak} > 10^{10}\, M_\odot$, the simulated SMHM relationship agrees with literature determinations, including exhibiting a small scatter of 0.3 dex. However, the scatter in the SMHM relation increases for lower-mass halos. We first present results for well-resolved halos that contain a simulated stellar population, but recognize that whether a halo hosts a galaxy is inherently mass resolution dependent. We thus adopt a probabilistic model to populate “dark” halos below our resolution limit to predict an “intrinsic” slope and scatter for the SMHM relation. We fit linearly growing log-normal scatter in stellar mass, which grows to more than 1 dex at $M_{\rm peak} = 10^{8}\, M_\odot$. At the faintest end of the SMHM relation probed by our simulations, a galaxy cannot be assigned a unique halo mass based solely on its luminosity. Instead, we provide a formula to stochastically populate low-mass halos following our results. Finally, we show that our growing log-normal scatter steepens the faint-end slope of the predicted stellar mass function.