

Title: Prediction of galaxy halo masses in SDSS DR7 via a machine learning approach
ABSTRACT

We present a machine learning (ML) approach for the prediction of galaxies’ dark matter halo masses which achieves an improved performance over conventional methods. We train three ML algorithms (XGBoost, random forests, and neural network) to predict halo masses using a set of synthetic galaxy catalogues that are built by populating dark matter haloes in N-body simulations with galaxies and that match both the clustering and the joint distributions of properties of galaxies in the Sloan Digital Sky Survey (SDSS). We explore the correlation of different galaxy- and group-related properties with halo mass, and extract the set of nine features that contribute the most to the prediction of halo mass. We find that mass predictions from the ML algorithms are more accurate than those from halo abundance matching (HAM) or dynamical mass estimates (DYN). Since the danger of this approach is that our training data might not accurately represent the real Universe, we explore the effect of testing the model on synthetic catalogues built with different assumptions than the ones used in the training phase. We test a variety of models with different ways of populating dark matter haloes, such as adding velocity bias for satellite galaxies. We determine that, though training and testing on different data can lead to systematic errors in predicted masses, the ML approach still yields substantially better masses than either HAM or DYN. Finally, we apply the trained model to a galaxy and group catalogue from the SDSS DR7 and present the resulting halo masses.
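As a rough illustration of the halo abundance matching (HAM) baseline that the abstract compares against, the sketch below ranks groups by total luminosity and assigns them halo masses from a rank-ordered list of simulated halo masses. The function name, the toy values, and the assumption that both lists come from the same volume (and so have equal length) are illustrative choices, not details taken from the paper.

```python
def abundance_match(group_luminosities, halo_masses):
    """Assign halo masses to groups by rank order.

    The brightest group receives the most massive halo, the second
    brightest the second most massive, and so on. Assumes both lists
    describe the same volume, so they have equal length.
    """
    if len(group_luminosities) != len(halo_masses):
        raise ValueError("expected one halo per group")
    # Indices of the groups, sorted from brightest to faintest.
    order = sorted(range(len(group_luminosities)),
                   key=lambda i: group_luminosities[i], reverse=True)
    # Halo masses sorted from most to least massive.
    ranked_masses = sorted(halo_masses, reverse=True)
    assigned = [0.0] * len(group_luminosities)
    for rank, idx in enumerate(order):
        assigned[idx] = ranked_masses[rank]
    return assigned


# Toy inputs (hypothetical): group luminosities and log10 halo masses.
lums = [3.0e10, 8.0e10, 1.5e10, 5.0e10]
log_masses = [11.8, 12.1, 12.7, 13.4]
matched = abundance_match(lums, log_masses)
print(matched)
```

Because this mapping is monotonic in a single property, it cannot make use of any additional galaxy or group features, which is the kind of information the ML models described above are trained on.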

 
NSF-PAR ID:
10121821
Publisher / Repository:
Oxford University Press
Journal Name:
Monthly Notices of the Royal Astronomical Society
Volume:
490
Issue:
2
ISSN:
0035-8711
Page Range / eLocation ID:
p. 2367-2379
Sponsoring Org:
National Science Foundation
More Like this
  1. ABSTRACT

    We select a volume-limited sample of galaxies derived from the SDSS DR7 to study the environment of low surface brightness (LSB) galaxies at different scales, as well as several physical properties of the dark matter haloes where the LSB galaxies of the sample are embedded. To characterize the environment, we make use of a number of publicly available value-added galaxy catalogues. We find a slight preference for LSB galaxies to be found in filaments instead of clusters, with their mean distance to the nearest filament typically larger than for high surface brightness (HSB) galaxies. The fraction of isolated central LSB galaxies is higher than the same fraction for HSB ones, and the density of their local environment lower. The stellar-to-halo mass ratio using four different estimates is up to ∼20 per cent for HSB galaxies. LSB central galaxies present more recent assembly times when compared with their HSB counterparts. Regarding the λ spin parameter, using six different proxies for its estimation, we find that LSB galaxies present systematically larger values of λ than the HSB galaxy sample, and constructing a control sample with direct kinematic information drawn from ALFALFA, we confirm that the spin parameter of LSB galaxies is 1.6–2 times larger than the one estimated for their HSB counterparts.

     
  2. ABSTRACT We explore the isothermal total density profiles of early-type galaxies (ETGs) in the IllustrisTNG simulation. For the selected 559 ETGs at z = 0 with stellar masses $10^{10.7}\, \mathrm{M}_{\odot } \leqslant M_{\ast } \leqslant 10^{11.9}\, \mathrm{M}_{\odot }$, the total power-law slope has a mean of 〈γ′〉 = 2.011 ± 0.007 and a scatter of $\sigma _{\gamma ^{\prime }} = 0.171$ over the radial range 0.4–4 times the stellar half-mass radius. Several correlations between γ′ and galactic properties including stellar mass, effective radius, stellar surface density, central velocity dispersion, central dark matter fraction, and in situ-formed stellar mass ratio are compared to observations and other simulations, revealing that IllustrisTNG reproduces many correlation trends, and in particular, γ′ is almost constant with redshift below z = 2. Through analysing IllustrisTNG model variations, we show that black hole kinetic winds are crucial to lowering γ′ and matching observed galaxy correlations. The effects of stellar winds on γ′ are subdominant compared to active galactic nucleus (AGN) feedback, and differ from those found in previous works because of the presence of AGN feedback. The density profiles of the ETG dark matter haloes are well described by steeper-than-NFW profiles, and they are steeper in the full physics (FP) run than their counterparts in the dark matter-only (DMO) run. Their inner density slopes anticorrelate (remain constant) with the halo mass in the FP (DMO) run, and anticorrelate with the halo concentration parameter $c_{200}$ in both types of runs. The dark matter haloes of low-mass ETGs are contracted whereas those of high-mass ETGs are expanded, suggesting that variations in the total density profile occur through the different halo responses to baryons.
  3. ABSTRACT Galaxy–galaxy lensing is a powerful probe of the connection between galaxies and their host dark matter haloes, which is important both for galaxy evolution and cosmology. We extend the measurement and modelling of the galaxy–galaxy lensing signal in the recent Dark Energy Survey Year 3 cosmology analysis to the highly non-linear scales (∼100 kpc). This extension enables us to study the galaxy–halo connection via a Halo Occupation Distribution (HOD) framework for the two lens samples used in the cosmology analysis: a luminous red galaxy sample (redmagic) and a magnitude-limited galaxy sample (maglim). We find that redmagic (maglim) galaxies typically live in dark matter haloes of mass log10(Mh/M⊙) ≈ 13.7 which is roughly constant over redshift (13.3−13.5 depending on redshift). We constrain these masses to ∼15 per cent, an improvement of approximately 1.5 times over the previous work. We also constrain the linear galaxy bias more than five times better than what is inferred from the cosmological scales alone. We find the satellite fraction for redmagic (maglim) to be ∼0.1−0.2 (0.1−0.3) with no clear trend in redshift. Our constraints on these halo properties are broadly consistent with other available estimates from previous work, large-scale constraints, and simulations. The framework built in this paper will be used for future HOD studies with other galaxy samples and extensions for cosmological analyses.
  4. ABSTRACT

    We present a per cent-level accurate model of the line-of-sight velocity distribution of galaxies around dark matter haloes as a function of projected radius and halo mass. The model is developed and tested using synthetic galaxy catalogues generated with the UniverseMachine run on the Multi-Dark Planck 2 N-body simulations. The model decomposes the galaxies around a cluster into three kinematically distinct classes: orbiting, infalling, and interloping galaxies. We demonstrate that: (1) we can statistically distinguish between these three types of galaxies using only projected line-of-sight velocity information; (2) the halo edge radius inferred from the line-of-sight velocity dispersion is an excellent proxy for the three-dimensional halo edge radius; and (3) we can accurately recover the full velocity dispersion profile for each of the three populations of galaxies. Importantly, the velocity dispersion profiles of the orbiting and infalling galaxies contain five independent parameters – three distinct radial scales and two velocity dispersion amplitudes – each of which is correlated with mass. Thus, the velocity dispersion profile of galaxy clusters has inherent redundancies that allow us to perform non-trivial systematics checks from a single data set. We discuss several potential applications of our new model for detecting the edge radius and constraining cosmology and astrophysics using upcoming spectroscopic surveys.

     
  5. ABSTRACT

    Galaxy cluster masses, rich with cosmological information, can be estimated from internal dark matter (DM) velocity dispersions, which in turn can be observationally inferred from satellite galaxy velocities. However, galaxies are biased tracers of the DM, and the bias can vary over host halo and galaxy properties as well as time. We precisely calibrate the velocity bias, $b_v$ – defined as the ratio of galaxy and DM velocity dispersions – as a function of redshift, host halo mass, and galaxy stellar mass threshold ($M_{\rm \star , sat}$), for massive haloes ($M_{\rm 200c} > 10^{13.5} \, {\rm M}_\odot$) from five cosmological simulations: IllustrisTNG, Magneticum, Bahamas + Macsis, The Three Hundred Project, and MultiDark Planck-2. We first compare scaling relations for galaxy and DM velocity dispersion across simulations; the former is estimated using a new ensemble velocity likelihood method that is unbiased for low galaxy counts per halo, while the latter uses a local linear regression. The simulations show consistent trends of $b_v$ increasing with $M_{\rm 200c}$ and decreasing with redshift and $M_{\rm \star , sat}$. The ensemble-estimated theoretical uncertainty in $b_v$ is 2–3 per cent, but becomes per cent-level when considering only the three highest resolution simulations. We update the mass–richness normalization for an SDSS redMaPPer cluster sample, and find our improved $b_v$ estimates reduce the normalization uncertainty from 22 to 8 per cent, demonstrating that dynamical mass estimation is competitive with weak lensing mass estimation. We discuss necessary steps for further improving this precision. Our estimates for $b_v(M_{\rm 200c}, M_{\rm \star , sat}, z)$ are made publicly available.
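
The velocity bias defined in the last abstract, the ratio of galaxy to DM velocity dispersions, can be illustrated with a minimal sketch. It uses a plain sample standard deviation for both dispersions, which is a simplification and not the ensemble velocity likelihood method or the local linear regression the abstract describes; the function name and the toy velocities are hypothetical.

```python
import statistics


def velocity_bias(galaxy_velocities, dm_velocities):
    """Ratio of galaxy to dark matter line-of-sight velocity dispersions.

    Each dispersion is taken as the sample standard deviation, a
    simplifying assumption made for this illustration only.
    """
    sigma_gal = statistics.stdev(galaxy_velocities)
    sigma_dm = statistics.stdev(dm_velocities)
    return sigma_gal / sigma_dm


# Toy line-of-sight velocities in km/s (hypothetical values).
gal_v = [-300.0, -100.0, 100.0, 300.0]
dm_v = [-600.0, -200.0, 200.0, 600.0]
b_v = velocity_bias(gal_v, dm_v)
print(b_v)
```

A value below 1 means the galaxies move more slowly, on average, than the dark matter they trace, so a dynamical mass inferred from the galaxy velocities alone would need a correction.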

     