This content will become publicly available on March 1, 2023

Title: Finding Universal Relations in Subhalo Properties with Artificial Intelligence
Abstract We use a generic formalism designed to search for relations in high-dimensional spaces to determine if the total mass of a subhalo can be predicted from other internal properties such as velocity dispersion, radius, or star formation rate. We train neural networks using data from the Cosmology and Astrophysics with MachinE Learning Simulations project and show that the model can predict the total mass of a subhalo with high accuracy: more than 99% of the subhalos have a predicted mass within 0.2 dex of their true value. The networks exhibit surprising extrapolation properties, being able to accurately predict the total mass of any type of subhalo containing any kind of galaxy at any redshift from simulations with different cosmologies, astrophysics models, subgrid physics, volumes, and resolutions, indicating that the network may have found a universal relation. We then use different methods to find equations that approximate the relation found by the networks and derive new analytic expressions that predict the total mass of a subhalo from its radius, velocity dispersion, and maximum circular velocity. We show that in some regimes, the analytic expressions are more accurate than the neural networks. The relation found by the neural network and approximated by the analytic equation bears similarities to the virial theorem.
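As a rough illustration of two ideas in the abstract, the sketch below computes the fraction of objects whose predicted mass lies within a tolerance in dex of the true mass, together with a generic virial-style estimate M ~ k R σ² / G. This is not the paper's analytic expression, which also involves the maximum circular velocity. The function names, the prefactor `k`, and the sample numbers are hypothetical and chosen only for illustration.

```python
import math

# Gravitational constant in kpc (km/s)^2 / Msun.
G_KPC = 4.30091e-6


def virial_like_mass(radius_kpc, sigma_kms, k=1.0):
    """Generic virial scaling M ~ k * R * sigma^2 / G, in solar masses.

    The prefactor k is a hypothetical order-unity constant, not a value
    taken from the paper.
    """
    return k * radius_kpc * sigma_kms ** 2 / G_KPC


def fraction_within_dex(m_true, m_pred, tol_dex=0.2):
    """Fraction of objects with |log10(M_pred / M_true)| <= tol_dex."""
    errors = [abs(math.log10(p / t)) for t, p in zip(m_true, m_pred)]
    return sum(e <= tol_dex for e in errors) / len(errors)


if __name__ == "__main__":
    # Made-up masses in solar masses, for illustration only.
    true_masses = [1e10, 1e11, 1e12]
    predicted = [1.2e10, 2e11, 1e12]
    print(fraction_within_dex(true_masses, predicted))
    print(virial_like_mass(10.0, 100.0))
```

An error of 0.2 dex corresponds to a factor of about 1.58 in mass, so the accuracy quoted in the abstract means that more than 99% of predictions are within roughly that factor of the true value.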
Journal Name:
The Astrophysical Journal
Sponsoring Org:
National Science Foundation
More Like this

  1. We apply machine learning (ML), a powerful method for uncovering complex correlations in high-dimensional data, to the galaxy–halo connection of cosmological hydrodynamical simulations. The mapping between galaxy and halo variables is stochastic in the absence of perfect information, but conventional ML models are deterministic and hence cannot capture its intrinsic scatter. To overcome this limitation, we design an ensemble of neural networks with a Gaussian loss function that predict probability distributions, allowing us to model statistical uncertainties in the galaxy–halo connection as well as its best-fitting trends. We extract a number of galaxy and halo variables from the Horizon-AGN and IllustrisTNG100-1 simulations and quantify the extent to which knowledge of some subset of one enables prediction of the other. This allows us to identify the key features of the galaxy–halo connection and investigate the origin of its scatter in various projections. We find that while halo properties beyond mass account for up to 50 per cent of the scatter in the halo-to-stellar mass relation, the prediction of stellar half-mass radius or total gas mass is not substantially improved by adding further halo properties. We also use these results to investigate semi-analytic models for galaxy size in the two simulations, finding that assumptions relating galaxy size to halo size or spin are not successful.
  2. Abstract We use IllustrisTNG simulations to explore the dynamic scaling relation between massive clusters and their (central) brightest cluster galaxies (BCGs). The IllustrisTNG-300 simulation we use includes 280 massive clusters from the $z = 0$ snapshot with $M_{200} > 10^{14}\,M_\odot$, enabling a robust statistical analysis. We derive the line-of-sight velocity dispersion of the stellar particles of the BCGs ($\sigma_{*,\mathrm{BCG}}$), analogous to the observed BCG stellar velocity dispersion. We also compute the subhalo velocity dispersion to measure the cluster velocity dispersion ($\sigma_{\mathrm{cl}}$). Both $\sigma_{*,\mathrm{BCG}}$ and $\sigma_{\mathrm{cl}}$ are proportional to the cluster halo mass, but the slopes differ slightly. Thus, like the observed relation, $\sigma_{*,\mathrm{BCG}}/\sigma_{\mathrm{cl}}$ declines as a function of $\sigma_{\mathrm{cl}}$, but the scatter is large. We explore the redshift evolution of the $\sigma_{*,\mathrm{BCG}}$–$\sigma_{\mathrm{cl}}$ scaling relation for $z \lesssim 1$ in a way that can be compared directly with observations. The scaling relation has a similar slope at high redshift, but the scatter increases because of the large scatter in $\sigma_{*,\mathrm{BCG}}$. The simulations imply that high-redshift BCGs are dynamically more complex than their low-redshift counterparts.

  3. We present a machine learning (ML) approach for the prediction of galaxies’ dark matter halo masses which achieves an improved performance over conventional methods. We train three ML algorithms (XGBoost, random forests, and neural network) to predict halo masses using a set of synthetic galaxy catalogues that are built by populating dark matter haloes in N-body simulations with galaxies and that match both the clustering and the joint distributions of properties of galaxies in the Sloan Digital Sky Survey (SDSS). We explore the correlation of different galaxy- and group-related properties with halo mass, and extract the set of nine features that contribute the most to the prediction of halo mass. We find that mass predictions from the ML algorithms are more accurate than those from halo abundance matching (HAM) or dynamical mass estimates (DYN). Since the danger of this approach is that our training data might not accurately represent the real Universe, we explore the effect of testing the model on synthetic catalogues built with different assumptions than the ones used in the training phase. We test a variety of models with different ways of populating dark matter haloes, such as adding velocity bias for satellite galaxies. We determine that, though training and testing on different data can lead to systematic errors in predicted masses, the ML approach still yields substantially better masses than either HAM or DYN. Finally, we apply the trained model to a galaxy and group catalogue from the SDSS DR7 and present the resulting halo masses.
  4. Abstract Galaxies can be characterized by many internal properties such as stellar mass, gas metallicity, and star formation rate. We quantify the amount of cosmological and astrophysical information that the internal properties of individual galaxies and their host dark matter halos contain. We train neural networks using hundreds of thousands of galaxies from 2000 state-of-the-art hydrodynamic simulations with different cosmologies and astrophysical models of the CAMELS project to perform likelihood-free inference on the value of the cosmological and astrophysical parameters. We find that knowing the internal properties of a single galaxy allows our models to infer the value of $\Omega_{\rm m}$, at fixed $\Omega_{\rm b}$, with a ∼10% precision, while no constraint can be placed on $\sigma_8$. Our results hold for any type of galaxy, central or satellite, massive or dwarf, at all considered redshifts, $z \leq 3$, and they incorporate uncertainties in astrophysics as modeled in CAMELS. However, our models are not robust to changes in subgrid physics due to the large intrinsic differences the two considered models imprint on galaxy properties. We find that the stellar mass, stellar metallicity, and maximum circular velocity are among the most important galaxy properties to determine the value of $\Omega_{\rm m}$. We believe that our results can be explained by considering that changes in the value of $\Omega_{\rm m}$, or potentially $\Omega_{\rm b}/\Omega_{\rm m}$, affect the dark matter content of galaxies, which leaves a signature in galaxy properties distinct from the one induced by galactic processes. Our results suggest that the low-dimensional manifold hosting galaxy properties provides a tight direct link between cosmology and astrophysics.
  5. ABSTRACT Pressure balance plays a central role in models of the interstellar medium (ISM), but whether and how pressure balance is realized in a realistic multiphase ISM is not yet well understood. We address this question by using a set of FIRE-2 cosmological zoom-in simulations of Milky Way-mass disc galaxies, in which a multiphase ISM is self-consistently shaped by gravity, cooling, and stellar feedback. We analyse how gravity determines the vertical pressure profile as well as how the total ISM pressure is partitioned between different phases and components (thermal, dispersion/turbulence, and bulk flows). We show that, on average and consistent with previous more idealized simulations, the total ISM pressure balances the weight of the overlying gas. Deviations from vertical pressure balance increase with increasing galactocentric radius and with decreasing averaging scale. The different phases are in rough total pressure equilibrium with one another, but with large deviations from thermal pressure equilibrium owing to kinetic support in the cold and warm phases, which dominate the total pressure near the mid-plane. Bulk flows (e.g. inflows and fountains) are important at a few disc scale heights, while thermal pressure from hot gas dominates at larger heights. Overall, the total mid-plane pressure is well-predicted by the weight of the disc gas and we show that it also scales linearly with the star formation rate surface density ($\Sigma_{\rm SFR}$). These results support the notion that the Kennicutt–Schmidt relation arises because $\Sigma_{\rm SFR}$ and the gas surface density ($\Sigma_{\rm g}$) are connected via the ISM mid-plane pressure.