Title: Learning effective physical laws for generating cosmological hydrodynamics with Lagrangian deep learning

The goal of generative models is to learn the intricate relations between the data to create new simulated data, but current approaches fail in very high dimensions. When the true data-generating process is based on physical processes, these impose symmetries and constraints, and the generative model can be created by learning an effective description of the underlying physics, which enables scaling of the generative model to very high dimensions. In this work, we propose Lagrangian deep learning (LDL) for this purpose, applying it to learn outputs of cosmological hydrodynamical simulations. The model uses layers of Lagrangian displacements of particles describing the observables to learn the effective physical laws. The displacements are modeled as the gradient of an effective potential, which explicitly satisfies translational and rotational invariance. The total number of learned parameters is only on the order of 10, and they can be viewed as effective theory parameters. We combine the N-body solver fast particle mesh (FastPM) with LDL and apply it to a wide range of cosmological outputs, from the dark matter to the stellar maps, gas density, and temperature. The computational cost of LDL is nearly four orders of magnitude lower than that of the full hydrodynamical simulations, yet it outperforms them at the same resolution. We achieve this with only on the order of 10 layers from the initial conditions to the final output, in contrast to typical cosmological simulations with thousands of time steps. This opens up the possibility of analyzing cosmological observations entirely within this framework, without the need for large dark-matter simulations.
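To make the idea of a displacement layer concrete, the following is a minimal sketch, not the authors' implementation. It moves particles along the gradient of a potential built from their own density field. The function names (`paint_ngp`, `displacement_layer`), the power-law source `(1 + delta) ** gamma`, the Gaussian Fourier-space filter, and the nearest-grid-point painting are all assumptions chosen for illustration; the abstract only states that displacements are the gradient of an effective potential.

```python
import numpy as np


def paint_ngp(pos, n_mesh, box):
    """Nearest-grid-point overdensity on a periodic n_mesh^3 grid.

    Returns the overdensity field and the cell index of each particle.
    """
    idx = np.floor(pos / box * n_mesh).astype(int) % n_mesh
    counts = np.zeros((n_mesh,) * 3)
    np.add.at(counts, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    return counts / counts.mean() - 1.0, idx


def displacement_layer(pos, n_mesh, box, alpha, gamma, k_smooth):
    """One illustrative displacement layer: pos -> pos + alpha * grad(phi).

    alpha, gamma and k_smooth play the role of learnable parameters here.
    The source term and the filter are hypothetical choices for this sketch.
    Use gamma >= 0 so that empty cells (delta = -1) stay finite.
    """
    delta, idx = paint_ngp(pos, n_mesh, box)
    source = (1.0 + delta) ** gamma  # assumed nonlinear source term

    # Wavenumbers of the periodic grid.
    k1d = 2.0 * np.pi * np.fft.fftfreq(n_mesh, d=box / n_mesh)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2

    # The filter depends only on |k|, so the potential has no preferred
    # direction or origin (up to the discreteness of the grid).
    phi_k = np.fft.fftn(source) * np.exp(-k2 / k_smooth**2)

    disp = np.empty_like(pos)
    for axis, k_axis in enumerate((kx, ky, kz)):
        # Spectral derivative: multiply by i*k, then transform back.
        grad = np.fft.ifftn(1j * k_axis * phi_k).real
        disp[:, axis] = grad[idx[:, 0], idx[:, 1], idx[:, 2]]

    return (pos + alpha * disp) % box
```

In this sketch each layer contributes three scalar parameters, so stacking a handful of layers stays within the "order 10" parameter count quoted above. The invariance is exact only for shifts by whole grid cells because of the nearest-grid-point assignment; a smoother painting scheme would reduce that limitation.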

 
Award ID(s):
1814370 1839217
NSF-PAR ID:
10221792
Publisher / Repository:
Proceedings of the National Academy of Sciences
Journal Name:
Proceedings of the National Academy of Sciences
Volume:
118
Issue:
16
ISSN:
0027-8424
Page Range / eLocation ID:
Article No. e2020324118
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Hydrodynamic simulations provide a powerful, but computationally expensive, approach to study the interplay of dark matter and baryons in cosmological structure formation. Here, we introduce the EMulating Baryonic EnRichment (EMBER) Deep Learning framework to predict baryon fields based on dark matter-only simulations, thereby reducing computational cost. EMBER comprises two network architectures, U-Net and Wasserstein Generative Adversarial Networks (WGANs), to predict 2D gas and H I densities from dark matter fields. We design the conditional WGANs as stochastic emulators, such that multiple target fields can be sampled from the same dark matter input. For training we combine cosmological volume and zoom-in hydrodynamical simulations from the Feedback in Realistic Environments (FIRE) project to represent a large range of scales. Our fiducial WGAN model reproduces the gas and H I power spectra within 10 per cent accuracy down to ∼10 kpc scales. Furthermore, we investigate the capability of EMBER to predict high resolution baryon fields from low resolution dark matter inputs through upsampling techniques. As a practical application, we use this methodology to emulate high-resolution H I maps for a dark matter simulation of a $L=100\, \text{Mpc}\, h^{-1}$ comoving cosmological box. The gas content of dark matter haloes and the H I column density distributions predicted by EMBER agree well with results of large volume cosmological simulations and abundance matching models. Our method provides a computationally efficient, stochastic emulator for augmenting dark matter only simulations with physically consistent maps of baryon fields.
  2. Cosmological inference with large galaxy surveys requires theoretical models that combine precise predictions for large-scale structure with robust and flexible galaxy formation modelling throughout a sufficiently large cosmic volume. Here, we introduce the MillenniumTNG (MTNG) project which combines the hydrodynamical galaxy formation model of IllustrisTNG with the large volume of the Millennium simulation. Our largest hydrodynamic simulation, covering $(500 \, h^{-1}{\rm Mpc})^3 \simeq (740\, {\rm Mpc})^3$, is complemented by a suite of dark-matter-only simulations with up to $4320^3$ dark matter particles (a mass resolution of $1.32\times 10^8 \, h^{-1}{\rm M}_\odot$) using the fixed-and-paired technique to reduce large-scale cosmic variance. The hydro simulation adds $4320^3$ gas cells, achieving a baryonic mass resolution of $2\times 10^7 \, h^{-1}{\rm M}_\odot$. High time-resolution merger trees and direct light-cone outputs facilitate the construction of a new generation of semi-analytic galaxy formation models that can be calibrated against both the hydro simulation and observation, and then applied to even larger volumes – MTNG includes a flagship simulation with 1.1 trillion dark matter particles and massive neutrinos in a volume of $(3000\, {\rm Mpc})^3$. In this introductory analysis we carry out convergence tests on basic measures of non-linear clustering such as the matter power spectrum, the halo mass function and halo clustering, and we compare simulation predictions to those from current cosmological emulators. We also use our simulations to study matter and halo statistics, such as halo bias and clustering at the baryonic acoustic oscillation scale. Finally we measure the impact of baryonic physics on the matter and halo distributions.
  3. Cosmological simulations of galaxy formation are limited by finite computational resources. We draw from the ongoing rapid advances in artificial intelligence (AI; specifically deep learning) to address this problem. Neural networks have been developed to learn from high-resolution (HR) image data and then make accurate superresolution (SR) versions of different low-resolution (LR) images. We apply such techniques to LR cosmological N-body simulations, generating SR versions. Specifically, we are able to enhance the simulation resolution by generating 512 times more particles and predicting their displacements from the initial positions. Therefore, our results can be viewed as simulation realizations themselves, rather than projections, e.g., to their density fields. Furthermore, the generation process is stochastic, enabling us to sample the small-scale modes conditioning on the large-scale environment. Our model learns from only 16 pairs of small-volume LR-HR simulations and is then able to generate SR simulations that successfully reproduce the HR matter power spectrum to percent level up to $16\, h^{-1}\, {\rm Mpc}$ and the HR halo mass function to within 10% down to $10^{11}\, {\rm M}_\odot$. We successfully deploy the model in a box 1,000 times larger than the training simulation box, showing that high-resolution mock surveys can be generated rapidly. We conclude that AI assistance has the potential to revolutionize modeling of small-scale galaxy-formation physics in large cosmological volumes.
  4. Next-generation weak lensing (WL) surveys, such as those by the Vera Rubin Observatory, the Roman Space Telescope, and the Euclid space mission, will supply vast amounts of data probing small, highly non-linear scales. Extracting information from these scales requires higher-order statistics and control of related systematics such as baryonic effects. To account for baryonic effects in cosmological analyses at reduced computational cost, semi-analytic baryonic correction models (BCMs) have been proposed. Here, we study the accuracy of a particular BCM (the A20-BCM) for WL peak counts, a well-studied, simple, and effective higher-order statistic. We compare WL peak counts generated from the full hydrodynamical simulation IllustrisTNG and a baryon-corrected version of the corresponding dark matter-only simulation IllustrisTNG-Dark. We apply galaxy shape noise matching depths reached by DES, KiDS, HSC, LSST, Roman, and Euclid. We find that peak counts from the A20-BCM are (i) accurate at per cent level for peaks with S/N < 4, (ii) statistically indistinguishable from IllustrisTNG in most current and ongoing surveys, but (iii) insufficient for deep future surveys covering the largest solid angles, such as LSST and Euclid. We find that the BCM matches individual peaks accurately, but underpredicts the amplitude of the highest peaks. We conclude that the A20-BCM is a viable substitute for full hydrodynamical simulations in cosmological parameter estimation from beyond-Gaussian statistics for ongoing and future surveys with modest solid angles. For the largest surveys, the A20-BCM must be refined to provide a more accurate match, especially to the highest peaks.
  5. In order to prepare for the upcoming wide-field cosmological surveys, large simulations of the Universe with realistic galaxy populations are required. In particular, the tendency of galaxies to naturally align towards overdensities, an effect called intrinsic alignments (IA), can be a major source of systematics in the weak lensing analysis. As the details of galaxy formation and evolution relevant to IA cannot be simulated in practice on such volumes, we propose as an alternative a Deep Generative Model. This model is trained on the IllustrisTNG-100 simulation and is capable of sampling the orientations of a population of galaxies so as to recover the correct alignments. In our approach, we model the cosmic web as a set of graphs, where the graphs are constructed for each halo, and galaxy orientations as a signal on those graphs. The generative model is implemented on a Generative Adversarial Network architecture and uses specifically designed Graph-Convolutional Networks sensitive to the relative 3D positions of the vertices. Given (sub)halo masses and tidal fields, the model is able to learn and predict scalar features such as galaxy and dark matter subhalo shapes and, more importantly, vector features such as the 3D orientation of the major axis of the ellipsoid and the complex 2D ellipticities. For correlations of 3D orientations the model is in good quantitative agreement with the measured values from the simulation, except at very small and transition scales. For correlations of 2D ellipticities, the model is in good quantitative agreement with the measured values from the simulation on all scales. Additionally, the model is able to capture the dependence of IA on mass, morphological type, and central/satellite type.