Title: Debiasing with Diffusion: Probabilistic Reconstruction of Dark Matter Fields from Galaxies with CAMELS
Abstract Galaxies are biased tracers of the underlying cosmic web, which is dominated by dark matter (DM) components that cannot be directly observed. Galaxy formation simulations can be used to study the relationship between DM density fields and galaxy distributions. However, this relationship can be sensitive to assumptions in cosmology and astrophysical processes embedded in galaxy formation models, which remain uncertain in many aspects. In this work, we develop a diffusion generative model to reconstruct DM fields from galaxies. The diffusion model is trained on the CAMELS simulation suite that contains thousands of state-of-the-art galaxy formation simulations with varying cosmological parameters and subgrid astrophysics. We demonstrate that the diffusion model can predict the unbiased posterior distribution of the underlying DM fields from the given stellar density fields while being able to marginalize over uncertainties in cosmological and astrophysical models. Interestingly, the model generalizes to simulation volumes ≈500 times larger than those it was trained on and across different galaxy formation models. The code for reproducing these results can be found at https://github.com/victoriaono/variational-diffusion-cdm.
Award ID(s):
2019786
PAR ID:
10528355
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
DOI PREFIX: 10.3847
Date Published:
Journal Name:
The Astrophysical Journal
Volume:
970
Issue:
2
ISSN:
0004-637X
Format(s):
Medium: X Size: Article No. 174
Size(s):
Article No. 174
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract As the next generation of large galaxy surveys comes online, it is becoming increasingly important to develop and understand the machine-learning tools that analyze big astronomical data. Neural networks are powerful and capable of probing deep patterns in data, but they must be trained carefully on large and representative data sets. We present a new “hump” of the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project: CAMELS-SAM, encompassing one thousand dark-matter-only simulations of $(100\,h^{-1}\,\mathrm{cMpc})^3$ with different cosmological parameters ($\Omega_m$ and $\sigma_8$) and run through the Santa Cruz semi-analytic model for galaxy formation over a broad range of astrophysical parameters. As a proof of concept for the power of this vast suite of simulated galaxies in a large volume and broad parameter space, we probe the power of simple clustering summary statistics to marginalize over astrophysics and constrain cosmology using neural networks. We use the two-point correlation, count-in-cells, and void probability functions, and we probe nonlinear and linear scales across $0.68 < R < 27\,h^{-1}\,\mathrm{cMpc}$. We find our neural networks can marginalize over the uncertainties in astrophysics to constrain cosmology to 3%–8% error across various types of galaxy selections, while simultaneously learning about the SC-SAM astrophysical parameters. This work encompasses vital first steps toward creating algorithms able to marginalize over the uncertainties in our galaxy formation models and measure the underlying cosmology of our Universe. CAMELS-SAM has been publicly released alongside the rest of CAMELS, and it offers great potential to many applications of machine learning in astrophysics: https://camels-sam.readthedocs.io.
  2. Abstract Precise and accurate predictions of the halo mass function for cluster mass scales in $w\nu$CDM cosmologies are crucial for extracting robust and unbiased cosmological information from upcoming galaxy cluster surveys. Here, we present a halo mass function emulator for cluster mass scales ($\gtrsim 10^{13}\,M_\odot/h$) up to redshift $z = 2$ with comprehensive support for the parameter space of $w\nu$CDM cosmologies allowed by current data. Based on the Aemulus $\nu$ suite of simulations, the emulator marks a significant improvement in the precision of halo mass function predictions by incorporating both massive neutrinos and non-standard dark energy equation of state models. This allows for accurate modeling of the cosmology dependence in large-scale structure and galaxy cluster studies. We show that the emulator, designed using Gaussian Process Regression, has negligible theoretical uncertainties compared to dominant sources of error in future cluster abundance studies. Our emulator is publicly available (https://github.com/DelonShen/aemulusnu_hmf), providing the community with a crucial tool for upcoming cosmological surveys such as LSST and Euclid.
  3. Abstract Gravitational waves (GWs) from merging compact objects encode direct information about the luminosity distance to the binary. When paired with a redshift measurement, this enables standard-siren cosmology: a Hubble diagram can be constructed to directly probe the Universe’s expansion. This can be done in the absence of electromagnetic measurements, as features in the mass distribution of GW sources provide self-calibrating redshift measurements without the need for a definite or probabilistic host galaxy association. This “spectral siren” technique has thus far only been applied with simple parametric representations of the mass distribution, and theoretical predictions for features in the mass distribution are commonly presumed to be fundamental to the measurement. However, the use of an inaccurate representation leads to biases in the cosmological inference, an acute problem given the current uncertainties in the true source population. Furthermore, it is commonly presumed that the form of the mass distribution must be known a priori to obtain unbiased measurements of cosmological parameters in this fashion. Here, we demonstrate that spectral sirens can accurately infer cosmological parameters without such prior assumptions. We apply a flexible, nonparametric model for the mass distribution of compact binaries to a simulated catalog of 1000 GW signals, consistent with expectations for the next LIGO–Virgo–KAGRA observing run. We find that, despite our model’s flexibility, both the source mass model and cosmological parameters are correctly reconstructed. We predict an 11.2% measurement of $H_0$, keeping all other cosmological parameters fixed, and a 6.4% measurement of $H(z = 0.9)$ when fitting for multiple cosmological parameters ($1\sigma$ uncertainties). This astrophysically agnostic spectral siren technique will be essential to arrive at precise and unbiased cosmological constraints from GW source populations.
  4. Abstract While space-borne optical and near-infrared facilities have succeeded in delivering a precise and spatially resolved picture of our Universe, their small survey area is known to underrepresent the true diversity of galaxy populations. Ground-based surveys have reached comparable depths but at lower spatial resolution, resulting in source confusion that hampers accurate photometry extractions. What once was limited to the infrared regime has now begun to challenge ground-based ultradeep surveys, affecting detection and photometry alike. Failing to address these challenges will mean forfeiting a representative view into the distant Universe. We introduce The Farmer: an automated, reproducible profile-fitting photometry package that pairs a library of smooth parametric models from The Tractor with a decision tree that determines the best-fit model in concert with neighboring sources. Photometry is measured by fitting the models on other bands leaving brightness free to vary. The resulting photometric measurements are naturally total, and no aperture corrections are required. Supporting diagnostics (e.g., $\chi^2$) enable measurement validation. As fitting models is relatively time intensive, The Farmer is built with high-performance computing routines. We benchmark The Farmer on a set of realistic COSMOS-like images and find accurate photometry, number counts, and galaxy shapes. The Farmer is already being utilized to produce catalogs for several large-area deep extragalactic surveys where it has been shown to tackle some of the most challenging optical and near-infrared data available, with the promise of extending to other ultradeep surveys expected in the near future. The Farmer is available to download from GitHub (https://github.com/astroweaver/the_farmer) and Zenodo (https://doi.org/10.5281/zenodo.8205817).
  5. Abstract New observational facilities are probing astrophysical transients such as stellar explosions and gravitational-wave sources at ever-increasing redshifts, while also revealing new features in source property distributions. To interpret these observations, we need to compare them to predictions from stellar population models. Such models require the metallicity-dependent cosmic star formation history ($\mathcal{S}(Z, z)$) as an input. Large uncertainties remain in the shape and evolution of this function. In this work, we propose a simple analytical function for $\mathcal{S}(Z, z)$. Variations of this function can be easily interpreted because the parameters link to its shape in an intuitive way. We fit our analytical function to the star-forming gas of the cosmological TNG100 simulation and find that it is able to capture the main behavior well. As an example application, we investigate the effect of systematic variations in the $\mathcal{S}(Z, z)$ parameters on the predicted mass distribution of locally merging binary black holes. Our main findings are that (i) the locations of features are remarkably robust against variations in the metallicity-dependent cosmic star formation history, and (ii) the low-mass end is least affected by these variations. This is promising as it increases our chances of constraining the physics that govern the formation of these objects (https://github.com/LiekeVanSon/SFRD_fit/tree/7348a1ad0d2ed6b78c70d5100fb3cd2515493f02/).