Title: Augmenting astrophysical scaling relations with machine learning: Application to reducing the Sunyaev–Zeldovich flux–mass scatter
Complex astrophysical systems often exhibit low-scatter relations between observable properties (e.g., luminosity, velocity dispersion, oscillation period). These scaling relations illuminate the underlying physics, and can provide observational tools for estimating masses and distances. Machine learning can provide a fast and systematic way to search for new scaling relations (or for simple extensions to existing relations) in abstract high-dimensional parameter spaces. We use a machine learning tool called symbolic regression (SR), which models patterns in a dataset in the form of analytic equations. We focus on the Sunyaev–Zeldovich flux–cluster mass relation ($Y_\mathrm{SZ}$–$M$), the scatter in which affects inference of cosmological parameters from cluster abundance data. Using SR on the data from the IllustrisTNG hydrodynamical simulation, we find a new proxy for cluster mass which combines $Y_\mathrm{SZ}$ and concentration of ionized gas ($c_\mathrm{gas}$): $M \propto Y_\mathrm{conc}^{3/5} \equiv Y_\mathrm{SZ}^{3/5}\,(1 - A\, c_\mathrm{gas})$. $Y_\mathrm{conc}$ reduces the scatter in the predicted $M$ by ∼20–30% for large clusters ($M \gtrsim 10^{14}\, h^{-1}\, M_\odot$), as compared to using just $Y_\mathrm{SZ}$. We show that the dependence on $c_\mathrm{gas}$ is linked to cores of clusters exhibiting larger scatter than their outskirts. Finally, we test $Y_\mathrm{conc}$ on clusters from CAMELS simulations and show that $Y_\mathrm{conc}$ is robust against variations in cosmology, subgrid physics, and cosmic variance. Our results and methodology can be useful for accurate multiwavelength cluster mass estimation from upcoming CMB and X-ray surveys like ACT, SO, eROSITA and CMB-S4.
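The proxy quoted in the abstract, $M \propto Y_\mathrm{SZ}^{3/5}(1 - A\, c_\mathrm{gas})$, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic and built so that the relation holds by construction, `A_TOY` is a hypothetical coefficient (not the value fitted in the paper), and the function names `mass_proxy` and `log_scatter` are invented here. It only shows the bookkeeping of comparing the logarithmic scatter of the two proxies, not the symbolic regression search itself.

```python
import numpy as np


def mass_proxy(y_sz, c_gas=None, a=0.0):
    """Mass proxy up to normalization: Y_SZ^(3/5), optionally times (1 - a * c_gas)."""
    proxy = y_sz ** 0.6
    if c_gas is not None:
        proxy = proxy * (1.0 - a * c_gas)
    return proxy


def log_scatter(mass, proxy):
    """Standard deviation of ln(M) - ln(proxy); a constant normalization drops out."""
    return float(np.std(np.log(mass) - np.log(proxy)))


# Synthetic toy clusters (assumption: NOT simulation data).
rng = np.random.default_rng(0)
n = 5000
A_TOY = 0.5                                # hypothetical coefficient, not from the paper
mass = 10 ** rng.uniform(14.0, 15.0, n)    # toy masses
c_gas = rng.uniform(0.1, 0.6, n)           # toy gas concentrations
noise = rng.normal(0.0, 0.05, n)           # toy intrinsic ln-scatter in Y_SZ

# Build Y_SZ so that M = Y_SZ^(3/5) * (1 - A_TOY * c_gas) holds up to the noise term.
y_sz = (mass / (1.0 - A_TOY * c_gas)) ** (5.0 / 3.0) * np.exp(noise)

scatter_plain = log_scatter(mass, mass_proxy(y_sz))
scatter_conc = log_scatter(mass, mass_proxy(y_sz, c_gas, A_TOY))

print(f"ln-scatter using Y_SZ only:        {scatter_plain:.3f}")
print(f"ln-scatter with concentration term: {scatter_conc:.3f}")
```

Because the toy data are generated from the corrected relation, the reduction in scatter here is guaranteed and says nothing about its size in real or simulated clusters; with actual data, `A` would have to be fitted and the scatter measured on held-out clusters.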
Award ID(s):
2108944
NSF-PAR ID:
10458998
Journal Name:
Proceedings of the National Academy of Sciences
Volume:
120
Issue:
12
ISSN:
0027-8424
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. ABSTRACT

    Feedback from active galactic nuclei (AGNs) and supernovae can affect measurements of integrated Sunyaev–Zeldovich (SZ) flux of haloes ($Y_\mathrm{SZ}$) from cosmic microwave background (CMB) surveys, and cause its relation with the halo mass ($Y_\mathrm{SZ}$–$M$) to deviate from the self-similar power-law prediction of the virial theorem. We perform a comprehensive study of such deviations using CAMELS, a suite of hydrodynamic simulations with extensive variations in feedback prescriptions. We use a combination of two machine learning tools (random forest and symbolic regression) to search for analogues of the $Y$–$M$ relation which are more robust to feedback processes for low masses ($M \lesssim 10^{14}\, h^{-1}\, \mathrm{M}_\odot$); we find that simply replacing $Y \rightarrow Y(1 + M_*/M_\mathrm{gas})$ in the relation makes it remarkably self-similar. This could serve as a robust multiwavelength mass proxy for low-mass clusters and galaxy groups. Our methodology can also be generally useful to improve the domain of validity of other astrophysical scaling relations. We also forecast that measurements of the $Y$–$M$ relation could provide per cent level constraints on certain combinations of feedback parameters and/or rule out a major part of the parameter space of supernova and AGN feedback models used in current state-of-the-art hydrodynamic simulations. Our results can be useful for using upcoming SZ surveys (e.g. SO, CMB-S4) and galaxy surveys (e.g. DESI and Rubin) to constrain the nature of baryonic feedback. Finally, we find that the alternative relation, $Y$–$M_*$, provides information on feedback that is complementary to that from $Y$–$M$.

     
  2. ABSTRACT

    Cosmological simulations are an important theoretical pillar for understanding non-linear structure formation in our Universe and for relating it to observations on large scales. In several papers, we introduce our MillenniumTNG (MTNG) project that provides a comprehensive set of high-resolution, large-volume simulations of cosmic structure formation aiming to better understand physical processes on large scales and to help interpret upcoming large-scale galaxy surveys. We here focus on the full physics box MTNG740 that computes a volume of $(740\, \mathrm{Mpc})^3$ with a baryonic mass resolution of $3.1\times 10^7\, \mathrm{M_\odot}$ using AREPO with 80.6 billion cells and the IllustrisTNG galaxy formation model. We verify that the galaxy properties produced by MTNG740 are consistent with the TNG simulations, including more recent observations. We focus on galaxy clusters and analyse cluster scaling relations and radial profiles. We show that both are broadly consistent with various observational constraints. We demonstrate that the SZ-signal on a deep light-cone is consistent with Planck limits. Finally, we compare MTNG740 clusters with galaxy clusters found in Planck and the SDSS-8 RedMaPPer richness catalogue in observational space, finding very good agreement as well. However, simultaneously matching cluster masses, richness, and Compton-y requires us to assume that the SZ mass estimates for Planck clusters are underestimated by 0.2 dex on average. Due to its unprecedented volume for a high-resolution hydrodynamical calculation, the MTNG740 simulation offers rich possibilities to study baryons in galaxies, galaxy clusters, and in large-scale structure, and in particular their impact on upcoming large cosmological surveys.

     
  3. ABSTRACT

    We perform a cross validation of the cluster catalogue selected by the red-sequence Matched-filter Probabilistic Percolation algorithm (redMaPPer) in Dark Energy Survey year 1 (DES-Y1) data by matching it with the Sunyaev–Zel’dovich effect (SZE) selected cluster catalogue from the South Pole Telescope SPT-SZ survey. Of the 1005 redMaPPer selected clusters with measured richness $\hat{\lambda} > 40$ in the joint footprint, 207 are confirmed by SPT-SZ. Using the mass information from the SZE signal, we calibrate the richness–mass relation using a Bayesian cluster population model. We find a mass trend $\lambda \propto M^B$ consistent with a linear relation ($B \sim 1$), no significant redshift evolution and an intrinsic scatter in richness of $\sigma_\lambda = 0.22 \pm 0.06$. By considering two error models, we explore the impact of projection effects on the richness–mass modelling, confirming that such effects are not detectable at the current level of systematic uncertainties. At low richness SPT-SZ confirms fewer redMaPPer clusters than expected. We interpret this richness-dependent deficit in confirmed systems as due to the increased presence at low richness of low-mass objects not correctly accounted for by our richness–mass scatter model, which we call contaminants. At a richness $\hat{\lambda} = 40$, this population makes up $>12$ per cent (97.5 percentile) of the total population. Extrapolating this to a measured richness $\hat{\lambda} = 20$ yields $>22$ per cent (97.5 percentile). With these contamination fractions, the predicted redMaPPer number counts in different plausible cosmologies are compatible with the measured abundance. The presence of such a population is also a plausible explanation for the different mass trends ($B \sim 0.75$) obtained from mass calibration using purely optically selected clusters. The mean mass from stacked weak lensing (WL) measurements suggests that these low-mass contaminants are galaxy groups with masses $\sim 3$–$5 \times 10^{13}\, \mathrm{M}_\odot$ which are beyond the sensitivity of current SZE and X-ray surveys but a natural target for SPT-3G and eROSITA.
  4. ABSTRACT

    We present updated cosmological constraints from measurements of the gas mass fractions ($f_\mathrm{gas}$) of massive, dynamically relaxed galaxy clusters. Our new data set has greater leverage on models of dark energy, thanks to the addition of the Perseus cluster at low redshifts, two new clusters at redshifts $z \gtrsim 1$, and significantly longer observations of four clusters at $0.6 < z < 0.9$. Our low-redshift ($z < 0.16$) $f_\mathrm{gas}$ data, combined with the cosmic baryon fraction measured from the cosmic microwave background (CMB), imply a Hubble constant of $h = 0.722 \pm 0.067$. Combining the full $f_\mathrm{gas}$ data set with priors on the cosmic baryon density and the Hubble constant, we constrain the dark energy density to be $\Omega_\Lambda = 0.865 \pm 0.119$ in non-flat Lambda cold dark matter (cosmological constant) models, and its equation of state to be $w=-1.13_{-0.20}^{+0.17}$ in flat, constant-$w$ models, respectively 41 per cent and 29 per cent tighter than our previous work, and comparable to the best constraints available from other probes. Combining $f_\mathrm{gas}$, CMB, supernova, and baryon acoustic oscillation data, we also constrain models with global curvature and evolving dark energy. For the massive, relaxed clusters employed here, we find the scaling of $f_\mathrm{gas}$ with mass to be consistent with a constant, with an intrinsic scatter that corresponds to just ∼3 per cent in distance.

     
  5. Context. Galaxy clusters are an important tool for cosmology, and their detection and characterization are key goals for current and future surveys. Using data from the Wide-field Infrared Survey Explorer (WISE), the Massive and Distant Clusters of WISE Survey (MaDCoWS) located 2839 significant galaxy overdensities at redshifts $0.7 \lesssim z \lesssim 1.5$, which included extensive follow-up imaging from the Spitzer Space Telescope to determine cluster richnesses. Concurrently, the Atacama Cosmology Telescope (ACT) has produced large area millimeter-wave maps in three frequency bands along with a large catalog of Sunyaev-Zeldovich (SZ)-selected clusters as part of its Data Release 5 (DR5). Aims. We aim to verify and characterize MaDCoWS clusters using measurements of, or limits on, their thermal SZ effect signatures. We also use these detections to establish the scaling relation between SZ mass and the MaDCoWS-defined richness. Methods. Using the maps and cluster catalog from DR5, we explore the scaling between SZ mass and cluster richness. We do this by comparing cataloged detections and extracting individual and stacked SZ signals from the MaDCoWS cluster locations. We use complementary radio survey data from the Very Large Array, submillimeter data from Herschel, and ACT 224 GHz data to assess the impact of contaminating sources on the SZ signals from both ACT and MaDCoWS clusters. We use a hierarchical Bayesian model to fit the mass-richness scaling relation, allowing for clusters to be drawn from two populations: one, a Gaussian centered on the mass-richness relation, and the other, a Gaussian centered on zero SZ signal. Results. We find that MaDCoWS clusters have submillimeter contamination that is consistent with a gray-body spectrum, while the ACT clusters are consistent with no submillimeter emission on average. Additionally, the intrinsic radio intensities of ACT clusters are lower than those of MaDCoWS clusters, even when the ACT clusters are restricted to the same redshift range as the MaDCoWS clusters. We find the best-fit ACT SZ mass versus MaDCoWS richness scaling relation has a slope of $p_1 = 1.84^{+0.15}_{-0.14}$, where the slope is defined as $M \propto \lambda_{15}^{p_1}$ and $\lambda_{15}$ is the richness. We also find that the ACT SZ signals for a significant fraction (∼57%) of the MaDCoWS sample can statistically be described as being drawn from a noise-like distribution, indicating that the candidates are possibly dominated by low-mass and unvirialized systems that are below the mass limit of the ACT sample. Further, we note that a large portion of the optically confirmed ACT clusters located in the same volume of the sky as MaDCoWS are not selected by MaDCoWS, indicating that the MaDCoWS sample is not complete with respect to SZ selection. Finally, we find that the radio loud fraction of MaDCoWS clusters increases with richness, while we find no evidence that the submillimeter emission of the MaDCoWS clusters evolves with richness. Conclusions. We conclude that the original MaDCoWS selection function is not well defined and, as such, reiterate the MaDCoWS collaboration’s recommendation that the sample is suited for probing cluster and galaxy evolution, but not cosmological analyses. We find a best-fit mass-richness relation slope that agrees with the published MaDCoWS preliminary results. Additionally, we find that while the approximate level of infill of the ACT and MaDCoWS cluster SZ signals (1–2%) is subdominant to other sources of uncertainty for current generation experiments, characterizing and removing this bias will be critical for next-generation experiments hoping to constrain cluster masses at the sub-percent level.