


Search for: All records

Creators/Authors contains: "Villaescusa-Navarro, Francisco"


  1. Abstract Recent works have discovered a relatively tight correlation between Ωm and the properties of individual simulated galaxies. Because of this, it has been shown that constraints on Ωm can be placed using the properties of individual galaxies while accounting for uncertainties in astrophysical processes such as feedback from supernovae and active galactic nuclei. In this work, we quantify whether using the properties of multiple galaxies simultaneously can tighten those constraints. For this, we train neural networks to perform likelihood-free inference on the value of two cosmological parameters (Ωm and σ8) and four astrophysical parameters using the properties of several galaxies from thousands of hydrodynamic simulations of the CAMELS project. We find that using properties of more than one galaxy increases the precision of the Ωm inference. Furthermore, using multiple galaxies enables the inference of other parameters that were poorly constrained with one single galaxy. We show that the same subset of galaxy properties are responsible for the constraints on Ωm from one and multiple galaxies. Finally, we quantify the robustness of the model and find that without identifying the model range of validity, the model does not perform well when tested on galaxies from other galaxy formation models.
    Free, publicly-accessible full text available July 1, 2025
  2. Abstract Galaxies are biased tracers of the underlying cosmic web, which is dominated by dark matter (DM) components that cannot be directly observed. Galaxy formation simulations can be used to study the relationship between DM density fields and galaxy distributions. However, this relationship can be sensitive to assumptions in cosmology and astrophysical processes embedded in galaxy formation models, which remain uncertain in many aspects. In this work, we develop a diffusion generative model to reconstruct DM fields from galaxies. The diffusion model is trained on the CAMELS simulation suite that contains thousands of state-of-the-art galaxy formation simulations with varying cosmological parameters and subgrid astrophysics. We demonstrate that the diffusion model can predict the unbiased posterior distribution of the underlying DM fields from the given stellar density fields while being able to marginalize over uncertainties in cosmological and astrophysical models. Interestingly, the model generalizes to simulation volumes ≈500 times larger than those it was trained on and across different galaxy formation models. The code for reproducing these results can be found at https://github.com/victoriaono/variational-diffusion-cdm.
  3. Abstract Most diffuse baryons, including the circumgalactic medium (CGM) surrounding galaxies and the intergalactic medium (IGM) in the cosmic web, remain unmeasured and unconstrained. Fast radio bursts (FRBs) offer an unparalleled method to measure the electron dispersion measures (DMs) of ionized baryons. Their distribution can resolve the missing baryon problem and constrain the history of feedback theorized to impart significant energy to the CGM and IGM. We analyze the Cosmology and Astrophysics with Machine Learning Simulations using three suites, IllustrisTNG, SIMBA, and Astrid, each varying six parameters (two cosmological and four astrophysical feedback), for a total of 183 distinct simulation models. We find significantly different predictions between the fiducial models of the suites owing to their different implementations of feedback. SIMBA exhibits the strongest feedback, leading to the smoothest distribution of baryons and reducing the sight-line-to-sight-line variance in DMs between z = 0 and 1. Astrid has the weakest feedback and the largest variance. We calculate FRB CGM measurements as a function of galaxy impact parameter, with SIMBA showing the weakest DMs due to aggressive active galactic nucleus (AGN) feedback and Astrid the strongest. Within each suite, the largest differences are due to varying AGN feedback. IllustrisTNG shows the most sensitivity to supernova feedback, but this is due to the change in the AGN feedback strengths, demonstrating that black holes, not stars, are most capable of redistributing baryons in the IGM and CGM. We compare our statistics directly to recent observations, paving the way for the use of FRBs to constrain the physics of galaxy formation and evolution.
  4. ABSTRACT The circum-galactic medium (CGM) can feasibly be mapped by multiwavelength surveys covering broad swaths of the sky. With multiple large data sets becoming available in the near future, we develop a likelihood-free Deep Learning technique using convolutional neural networks (CNNs) to infer broad-scale physical properties of a galaxy’s CGM and its halo mass for the first time. Using CAMELS (Cosmology and Astrophysics with MachinE Learning Simulations) data, including IllustrisTNG, SIMBA, and Astrid models, we train CNNs on Soft X-ray and 21-cm (H i) radio two-dimensional maps to trace hot and cool gas, respectively, around galaxies, groups, and clusters. Our CNNs offer the unique ability to train and test on ‘multifield’ data sets comprised of both H i and X-ray maps, providing complementary information about physical CGM properties and improved inferences. Applying eRASS:4 survey limits shows that X-ray is not powerful enough to infer individual haloes with masses log (Mhalo/M⊙) < 12.5. The multifield improves the inference for all halo masses. Generally, the CNN trained and tested on Astrid (SIMBA) can most (least) accurately infer CGM properties. Cross-simulation analysis – training on one galaxy formation model and testing on another – highlights the challenges of developing CNNs trained on a single model to marginalize over astrophysical uncertainties and perform robust inferences on real data. The next crucial step in improving the resulting inferences on the physical properties of CGM depends on our ability to interpret these deep-learning models. 
  5. Abstract Galaxy formation models within cosmological hydrodynamical simulations contain numerous parameters with nontrivial influences over the resulting properties of simulated cosmic structures and galaxy populations. It is computationally challenging to sample these high dimensional parameter spaces with simulations, in particular for halos in the high-mass end of the mass function. In this work, we develop a novel sampling and reduced variance regression method, CARPoolGP, which leverages built-in correlations between samples in different locations of high dimensional parameter spaces to provide an efficient way to explore parameter space and generate low-variance emulations of summary statistics. We use this method to extend the Cosmology and Astrophysics with MachinE Learning Simulations to include a set of 768 zoom-in simulations of halos in the mass range of $$10^{13}$$–$$10^{14.5}\, \mathrm{M}_\odot\, h^{-1}$$ that span a 28-dimensional parameter space in the IllustrisTNG model. With these simulations and the CARPoolGP emulation method, we explore parameter trends in the Compton Y–M, black hole mass–halo mass, and metallicity–mass relations, as well as thermodynamic profiles and quenched fractions of satellite galaxies. We use these emulations to provide a physical picture of the complex interplay between supernova and active galactic nuclei feedback. We then use emulations of the Y–M relation of massive halos to perform Fisher forecasts on astrophysical parameters for future Sunyaev–Zeldovich observations and find a significant improvement in forecasted constraints. We publicly release both the simulation suite and CARPoolGP software package.
  6. ABSTRACT We quantify the cosmological spread of baryons relative to their initial neighbouring dark matter distribution using thousands of state-of-the-art simulations from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project. We show that dark matter particles spread relative to their initial neighbouring distribution owing to chaotic gravitational dynamics on spatial scales comparable to their host dark matter halo. In contrast, gas in hydrodynamic simulations spreads much further from the initial neighbouring dark matter owing to feedback from supernovae (SNe) and active galactic nuclei (AGN). We show that large-scale baryon spread is very sensitive to model implementation details, with the fiducial SIMBA model spreading ∼40 per cent of baryons >1 Mpc away compared to ∼10 per cent for the IllustrisTNG and Astrid models. Increasing the efficiency of AGN-driven outflows greatly increases baryon spread while increasing the strength of SNe-driven winds can decrease spreading due to non-linear coupling of stellar and AGN feedback. We compare total matter power spectra between hydrodynamic and paired N-body simulations and demonstrate that the baryonic spread metric broadly captures the global impact of feedback on matter clustering over variations of cosmological and astrophysical parameters, initial conditions, and (to a lesser extent) galaxy formation models. Using symbolic regression, we find a function that reproduces the suppression of power by feedback as a function of wave number (k) and baryonic spread up to $$k \sim 10\, h\, \mathrm{Mpc}^{-1}$$ in SIMBA while highlighting the challenge of developing models robust to variations in galaxy formation physics implementation.
  7. Abstract It is well known that the power spectrum is not able to fully characterize the statistical properties of non-Gaussian density fields. Recently, many different statistics have been proposed to extract information from non-Gaussian cosmological fields that perform better than the power spectrum. The Fisher matrix formalism is commonly used to quantify the accuracy with which a given statistic can constrain the value of the cosmological parameters. However, these calculations typically rely on the assumption that the sampling distribution of the considered statistic follows a multivariate Gaussian distribution. In this work, we follow Sellentin & Heavens and use two different statistical tests to identify non-Gaussianities in different statistics such as the power spectrum, bispectrum, marked power spectrum, and wavelet scattering transform (WST). We remove the non-Gaussian components of the different statistics and perform Fisher matrix calculations with the Gaussianized statistics using Quijote simulations. We show that constraints on the parameters can change by a factor of ∼2 in some cases. We show with simple examples how statistics that do not follow a multivariate Gaussian distribution can achieve artificially tight bounds on the cosmological parameters when using the Fisher matrix formalism. We think that the non-Gaussian tests used in this work represent a powerful tool to quantify the robustness of Fisher matrix calculations and their underlying assumptions. We release the code used to compute the power spectra, bispectra, and WST that can be run on both CPUs and GPUs.
  8. ABSTRACT Extracting information from the total matter power spectrum with the precision needed for upcoming cosmological surveys requires unraveling the complex effects of galaxy formation processes on the distribution of matter. We investigate the impact of baryonic physics on matter clustering at z = 0 using a library of power spectra from the Cosmology and Astrophysics with MachinE Learning Simulations project, containing thousands of $$(25\, h^{-1}\, {\rm Mpc})^3$$ volume realizations with varying cosmology, initial random field, stellar and active galactic nucleus (AGN) feedback strength and subgrid model implementation methods. We show that baryonic physics affects matter clustering on scales $$k \gtrsim 0.4\, h\, \mathrm{Mpc}^{-1}$$ and the magnitude of this effect is dependent on the details of the galaxy formation implementation and variations of cosmological and astrophysical parameters. Increasing AGN feedback strength decreases halo baryon fractions and yields stronger suppression of power relative to N-body simulations, while stronger stellar feedback often results in weaker effects by suppressing black hole growth and therefore the impact of AGN feedback. We find a broad correlation between mean baryon fraction of massive haloes ($$M_{\rm 200c} > 10^{13.5}\, {\rm M}_\odot$$) and suppression of matter clustering but with significant scatter compared to previous work owing to wider exploration of feedback parameters and cosmic variance effects. We show that a random forest regressor trained on the baryon content and abundance of haloes across the full mass range $$10^{10} \le M_{\rm halo}/{\rm M}_\odot < 10^{15}$$ can predict the effect of galaxy formation on the matter power spectrum on scales k = 1.0–20.0 $$h\, \mathrm{Mpc}^{-1}$$.
  9. ABSTRACT Forward-modeling observables from galaxy simulations enables direct comparisons between theory and observations. To generate synthetic spectral energy distributions (SEDs) that include dust absorption, re-emission, and scattering, Monte Carlo radiative transfer is often used in post-processing on a galaxy-by-galaxy basis. However, this is computationally expensive, especially if one wants to make predictions for suites of many cosmological simulations. To alleviate this computational burden, we have developed a radiative transfer emulator using an artificial neural network (ANN), ANNgelina, that can reliably predict SEDs of simulated galaxies using a small number of integrated properties of the simulated galaxies: star formation rate, stellar and dust masses, and mass-weighted metallicities of all star particles and of only star particles with age <10 Myr. Here, we present the methodology and quantify the accuracy of the predictions. We train the ANN on SEDs computed for galaxies from the IllustrisTNG project’s TNG50 cosmological magnetohydrodynamical simulation. ANNgelina is able to predict the SEDs of TNG50 galaxies in the ultraviolet (UV) to millimetre regime with a typical median absolute error of ∼7 per cent. The prediction error is the greatest in the UV, possibly due to the viewing-angle dependence being greatest in this wavelength regime. Our results demonstrate that our ANN-based emulator is a promising computationally inexpensive alternative for forward-modeling galaxy SEDs from cosmological simulations. 
  10. Abstract Recent work has pointed out the potential existence of a tight relation between the cosmological parameter Ωm, at fixed Ωb, and the properties of individual galaxies in state-of-the-art cosmological hydrodynamic simulations. In this paper, we investigate whether such a relation also holds for galaxies from simulations run with a different code that makes use of a distinct subgrid physics: Astrid. We also find that in this case, neural networks are able to infer the value of Ωm with a ∼10% precision from the properties of individual galaxies, while accounting for astrophysics uncertainties, as modeled in Cosmology and Astrophysics with MachinE Learning (CAMELS). This tight relationship is present at all considered redshifts, z ≤ 3, and the stellar mass, the stellar metallicity, and the maximum circular velocity are among the most important galaxy properties behind the relation. In order to use this method with real galaxies, one needs to quantify its robustness: the accuracy of the model when tested on galaxies generated by codes different from the one used for training. We quantify the robustness of the models by testing them on galaxies from four different codes: IllustrisTNG, SIMBA, Astrid, and Magneticum. We show that the models perform well on a large fraction of the galaxies, but fail dramatically on a small fraction of them. Removing these outliers significantly improves the accuracy of the models across simulation codes.