skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 11:00 PM ET on Friday, May 16 until 2:00 AM ET on Saturday, May 17 due to maintenance. We apologize for the inconvenience.


Title: Cosmology with One Galaxy?
Abstract Galaxies can be characterized by many internal properties such as stellar mass, gas metallicity, and star formation rate. We quantify the amount of cosmological and astrophysical information that the internal properties of individual galaxies and their host dark matter halos contain. We train neural networks using hundreds of thousands of galaxies from 2000 state-of-the-art hydrodynamic simulations with different cosmologies and astrophysical models of the CAMELS project to perform likelihood-free inference on the value of the cosmological and astrophysical parameters. We find that knowing the internal properties of a single galaxy allows our models to infer the value of Ω m , at fixed Ω b , with a ∼10% precision, while no constraint can be placed on σ 8 . Our results hold for any type of galaxy, central or satellite, massive or dwarf, at all considered redshifts, z ≤ 3, and they incorporate uncertainties in astrophysics as modeled in CAMELS. However, our models are not robust to changes in subgrid physics due to the large intrinsic differences the two considered models imprint on galaxy properties. We find that the stellar mass, stellar metallicity, and maximum circular velocity are among the most important galaxy properties to determine the value of Ω m . We believe that our results can be explained by considering that changes in the value of Ω m , or potentially Ω b /Ω m , affect the dark matter content of galaxies, which leaves a signature in galaxy properties distinct from the one induced by galactic processes. Our results suggest that the low-dimensional manifold hosting galaxy properties provides a tight direct link between cosmology and astrophysics.  more » « less
Award ID(s):
2108944
PAR ID:
10331324
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ; ; ;
Date Published:
Journal Name:
The Astrophysical Journal
Volume:
929
Issue:
2
ISSN:
0004-637X
Page Range / eLocation ID:
132
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Recent work has pointed out the potential existence of a tight relation between the cosmological parameter Ω m , at fixed Ω b , and the properties of individual galaxies in state-of-the-art cosmological hydrodynamic simulations. In this paper, we investigate whether such a relation also holds for galaxies from simulations run with a different code that makes use of a distinct subgrid physics: Astrid. We also find that in this case, neural networks are able to infer the value of Ω m with a ∼10% precision from the properties of individual galaxies, while accounting for astrophysics uncertainties, as modeled in Cosmology and Astrophysics with MachinE Learning (CAMELS). This tight relationship is present at all considered redshifts, z ≤ 3, and the stellar mass, the stellar metallicity, and the maximum circular velocity are among the most important galaxy properties behind the relation. In order to use this method with real galaxies, one needs to quantify its robustness: the accuracy of the model when tested on galaxies generated by codes different from the one used for training. We quantify the robustness of the models by testing them on galaxies from four different codes: IllustrisTNG, SIMBA, Astrid, and Magneticum. We show that the models perform well on a large fraction of the galaxies, but fail dramatically on a small fraction of them. Removing these outliers significantly improves the accuracy of the models across simulation codes. 
    more » « less
  2. Abstract As the next generation of large galaxy surveys come online, it is becoming increasingly important to develop and understand the machine-learning tools that analyze big astronomical data. Neural networks are powerful and capable of probing deep patterns in data, but they must be trained carefully on large and representative data sets. We present a new “hump” of the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project: CAMELS-SAM, encompassing one thousand dark-matter-only simulations of (100h−1cMpc)3with different cosmological parameters (Ωmandσ8) and run through the Santa Cruz semi-analytic model for galaxy formation over a broad range of astrophysical parameters. As a proof of concept for the power of this vast suite of simulated galaxies in a large volume and broad parameter space, we probe the power of simple clustering summary statistics to marginalize over astrophysics and constrain cosmology using neural networks. We use the two-point correlation, count-in-cells, and void probability functions, and we probe nonlinear and linear scales across 0.68 <R<27h−1cMpc. We find our neural networks can both marginalize over the uncertainties in astrophysics to constrain cosmology to 3%–8% error across various types of galaxy selections, while simultaneously learning about the SC-SAM astrophysical parameters. This work encompasses vital first steps toward creating algorithms able to marginalize over the uncertainties in our galaxy formation models and measure the underlying cosmology of our Universe. CAMELS-SAM has been publicly released alongside the rest of CAMELS, and it offers great potential to many applications of machine learning in astrophysics:https://camels-sam.readthedocs.io. 
    more » « less
  3. Abstract We present CAMELS-ASTRID, the third suite of hydrodynamical simulations in the Cosmology and Astrophysics with MachinE Learning (CAMELS) project, along with new simulation sets that extend the model parameter space based on the previous frameworks of CAMELS-TNG and CAMELS-SIMBA, to provide broader training sets and testing grounds for machine-learning algorithms designed for cosmological studies. CAMELS-ASTRID employs the galaxy formation model following the ASTRID simulation and contains 2124 hydrodynamic simulation runs that vary three cosmological parameters (Ωm8, Ωb) and four parameters controlling stellar and active galactic nucleus (AGN) feedback. Compared to the existing TNG and SIMBA simulation suites in CAMELS, the fiducial model of ASTRID features the mildest AGN feedback and predicts the least baryonic effect on the matter power spectrum. The training set of ASTRID covers a broader variation in the galaxy populations and the baryonic impact on the matter power spectrum compared to its TNG and SIMBA counterparts, which can make machine-learning models trained on the ASTRID suite exhibit better extrapolation performance when tested on other hydrodynamic simulation sets. We also introduce extension simulation sets in CAMELS that widely explore 28 parameters in the TNG and SIMBA models, demonstrating the enormity of the overall galaxy formation model parameter space and the complex nonlinear interplay between cosmology and astrophysical processes. With the new simulation suites, we show that building robust machine-learning models favors training and testing on the largest possible diversity of galaxy formation models. We also demonstrate that it is possible to train accurate neural networks to infer cosmological parameters using the high-dimensional TNG-SB28 simulation set. 
    more » « less
  4. Abstract Recent works have discovered a relatively tight correlation between Ωmand the properties of individual simulated galaxies. Because of this, it has been shown that constraints on Ωmcan be placed using the properties of individual galaxies while accounting for uncertainties in astrophysical processes such as feedback from supernovae and active galactic nuclei. In this work, we quantify whether using the properties of multiple galaxies simultaneously can tighten those constraints. For this, we train neural networks to perform likelihood-free inference on the value of two cosmological parameters (Ωmandσ8) and four astrophysical parameters using the properties of several galaxies from thousands of hydrodynamic simulations of the CAMELS project. We find that using properties of more than one galaxy increases the precision of the Ωminference. Furthermore, using multiple galaxies enables the inference of other parameters that were poorly constrained with one single galaxy. We show that the same subset of galaxy properties are responsible for the constraints on Ωmfrom one and multiple galaxies. Finally, we quantify the robustness of the model and find that without identifying the model range of validity, the model does not perform well when tested on galaxies from other galaxy formation models. 
    more » « less
  5. Abstract Galaxies are biased tracers of the underlying cosmic web, which is dominated by dark matter (DM) components that cannot be directly observed. Galaxy formation simulations can be used to study the relationship between DM density fields and galaxy distributions. However, this relationship can be sensitive to assumptions in cosmology and astrophysical processes embedded in galaxy formation models, which remain uncertain in many aspects. In this work, we develop a diffusion generative model to reconstruct DM fields from galaxies. The diffusion model is trained on the CAMELS simulation suite that contains thousands of state-of-the-art galaxy formation simulations with varying cosmological parameters and subgrid astrophysics. We demonstrate that the diffusion model can predict the unbiased posterior distribution of the underlying DM fields from the given stellar density fields while being able to marginalize over uncertainties in cosmological and astrophysical models. Interestingly, the model generalizes to simulation volumes ≈500 times larger than those it was trained on and across different galaxy formation models. The code for reproducing these results can be found athttps://github.com/victoriaono/variational-diffusion-cdm✎. 
    more » « less