
Title: Translation and rotation equivariant normalizing flow (TRENF) for optimal cosmological analysis

Our Universe is homogeneous and isotropic, and its perturbations obey translation and rotation symmetry. In this work, we develop the translation and rotation equivariant normalizing flow (TRENF), a generative normalizing flow (NF) model that explicitly incorporates these symmetries, defining the data likelihood via a sequence of Fourier-space convolutions and pixel-wise non-linear transforms. TRENF gives direct access to the high-dimensional data likelihood p(x|y) as a function of the labels y, such as cosmological parameters. In contrast to traditional analyses based on summary statistics, the NF approach suffers no loss of information, since it preserves the full dimensionality of the data. On Gaussian random fields, the TRENF likelihood agrees well with the analytical expression and saturates the Fisher information content in the labels y. On non-linear cosmological overdensity fields from N-body simulations, TRENF yields significant improvements in constraining power over the standard power spectrum summary statistic. TRENF is also a generative model of the data: we show that TRENF samples agree well with the N-body simulations on which it was trained, and that the inverse mapping of the data agrees well with Gaussian white noise, both visually and on various summary statistics; when this is perfectly achieved, the resulting p(x|y) likelihood analysis becomes optimal. Finally, we develop a generalization of this model that can handle effects that break the symmetry of the data, such as a survey mask, enabling likelihood analysis on data without periodic boundaries.
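The core mechanism described above — a convolution applied as a multiplication in Fourier space, followed by a pixel-wise monotonic non-linearity, each contributing a tractable log-Jacobian to the change-of-variables likelihood — can be sketched as follows. This is an illustrative toy, not the authors' implementation: a real TRENF stacks many learned layers, and rotation equivariance would additionally require the kernel to depend only on |k|; the single layer and leaky-linear non-linearity here are simplifying assumptions.

```python
import numpy as np

def trenf_layer(x, kernel_fourier, alpha):
    """One illustrative TRENF-style layer: a translation-equivariant
    convolution applied as multiplication in Fourier space, followed by a
    pixel-wise monotonic (leaky-linear) transform. `kernel_fourier` must be
    Hermitian-symmetric so the convolved field stays real; for rotation
    equivariance it would also have to depend only on |k|."""
    xk = np.fft.fft2(x)
    y = np.fft.ifft2(kernel_fourier * xk).real
    # The convolution is diagonal in Fourier space, so its log-Jacobian is
    # the sum of log|W(k)| over all modes.
    logdet_conv = np.sum(np.log(np.abs(kernel_fourier)))
    # Pixel-wise monotonic non-linearity (slope 1 for y > 0, alpha otherwise)
    # with an elementwise, hence diagonal, Jacobian.
    z = np.where(y > 0, y, alpha * y)
    logdet_nl = np.sum(np.where(y > 0, 0.0, np.log(alpha)))
    return z, logdet_conv + logdet_nl

def log_likelihood(x, kernel_fourier, alpha):
    """log p(x) by change of variables to a unit Gaussian latent field."""
    z, logdet = trenf_layer(x, kernel_fourier, alpha)
    log_prior = -0.5 * np.sum(z ** 2) - 0.5 * z.size * np.log(2.0 * np.pi)
    return log_prior + logdet
```

With `kernel_fourier` set to all ones and `alpha = 1`, the layer is the identity and `log_likelihood` reduces to the standard normal log-density, which is a convenient sanity check.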

Award ID(s): 1839217, 1814370
Publisher / Repository: Oxford University Press
Journal Name: Monthly Notices of the Royal Astronomical Society
Page Range / eLocation ID: p. 2363-2373
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract A wealth of cosmological and astrophysical information is expected from many ongoing and upcoming large-scale surveys. It is crucial to prepare for these surveys now and to develop tools that can efficiently extract the most information. We present HIFlow: a fast generative model of neutral hydrogen (H I) maps that is conditioned only on cosmology (Ωm and σ8) and designed using a class of normalizing flow models, the masked autoregressive flow (MAF). HIFlow is trained on state-of-the-art simulations from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project. HIFlow is able to generate realistic, diverse maps without explicitly incorporating the expected two-dimensional map structure into the flow as an inductive bias. We find that HIFlow reproduces the CAMELS average and standard deviation of the H I power spectrum within a factor of ≲2, scoring a very high R² > 90%. By inverting the flow, HIFlow provides a tractable high-dimensional likelihood for efficient parameter inference. We show that HIFlow, conditioned on cosmology, is able to marginalize over astrophysics at the field level, regardless of the stellar and AGN feedback strengths. This new tool represents a first step toward more powerful parameter inference, maximizing the scientific return of future H I surveys and opening a new avenue to minimize the loss of complex information due to data compression down to summary statistics.
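The masked autoregressive flow underlying this kind of model builds an invertible map whose Jacobian is triangular, so the log-determinant needed for the likelihood is a simple sum. A minimal conditional affine autoregressive step might look like the following sketch — the weight matrices are hypothetical, and this is not a reproduction of HIFlow's actual stacked-MADE architecture:

```python
import numpy as np

def maf_step(x, W_mu, W_logs, cond, V_mu, V_logs):
    """One conditional affine autoregressive step (illustrative sketch with
    hypothetical weights W_mu, W_logs, V_mu, V_logs). Output dimension i is
    shifted and scaled using only x[<i] plus the conditioning labels `cond`
    (e.g. Omega_m and sigma_8); a strictly lower-triangular mask enforces
    the autoregressive property."""
    d = x.size
    mask = np.tril(np.ones((d, d)), k=-1)          # strict lower triangle
    mu = (mask * W_mu) @ x + V_mu @ cond           # shift, uses x[<i] only
    log_s = (mask * W_logs) @ x + V_logs @ cond    # log-scale, x[<i] only
    z = (x - mu) * np.exp(-log_s)
    # The Jacobian dz/dx is lower triangular with diagonal exp(-log_s),
    # so log|det| = -sum(log_s): the tractable term in the likelihood.
    return z, -np.sum(log_s)
```

Because each `z[i]` depends on `x[i]` only through the affine shift and scale, the step is invertible dimension by dimension, which is what makes the inverted flow usable as a likelihood for parameter inference.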

  2. We present cosmological constraints from the Subaru Hyper Suprime-Cam (HSC) first-year weak lensing shear catalogue using convolutional neural networks (CNNs) and conventional summary statistics. We crop 19 $3\times 3\, \mathrm{{deg}^2}$ sub-fields from the first-year area, divide the galaxies with redshift 0.3 ≤ z ≤ 1.5 into four equally spaced redshift bins, and perform tomographic analyses. We develop a pipeline to generate simulated convergence maps from cosmological N-body simulations, where we account for effects such as intrinsic alignments (IAs), baryons, photometric redshift errors, and point spread function errors, to match characteristics of the real catalogue. We train CNNs that can predict the underlying parameters from the simulated maps, and we use them to construct likelihood functions for Bayesian analyses. In the Λ cold dark matter model with two free cosmological parameters Ωm and σ8, we find $\Omega _\mathrm{m}=0.278_{-0.035}^{+0.037}$, $S_8\equiv (\Omega _\mathrm{m}/0.3)^{0.5}\sigma _{8}=0.793_{-0.018}^{+0.017}$, and the IA amplitude $A_\mathrm{IA}=0.20_{-0.58}^{+0.55}$. In a model with four additional free baryonic parameters, we find $\Omega _\mathrm{m}=0.268_{-0.036}^{+0.040}$, $S_8=0.819_{-0.024}^{+0.034}$, and $A_\mathrm{IA}=-0.16_{-0.58}^{+0.59}$, with the baryonic parameters not well constrained. We also find that statistical uncertainties of the parameters by the CNNs are smaller than those from the power spectrum (5–24 per cent smaller for S8 and a factor of 2.5–3.0 smaller for Ωm), showing the effectiveness of CNNs for uncovering additional cosmological information from the HSC data. With baryons, the S8 discrepancy between HSC first-year data and Planck 2018 is reduced from $\sim 2.2\, \sigma$ to $0.3\!-\!0.5\, \sigma$.

  3. Batch Normalization (BN) is essential to effectively train state-of-the-art deep Convolutional Neural Networks (CNNs). It normalizes the layer outputs during training using the statistics of each mini-batch. BN accelerates the training procedure by allowing the safe use of large learning rates and alleviates the need for careful initialization of the parameters. In this work, we study BN from the viewpoint of Fisher kernels that arise from generative probability models. We show that, assuming the samples within a mini-batch are drawn from the same probability density function, BN is identical to the Fisher vector of a Gaussian distribution. This means the batch normalizing transform can be explained in terms of kernels that naturally emerge from the probability density function modelling the generative process of the underlying data distribution. Consequently, it promises higher discrimination power for the batch-normalized mini-batch. However, given the rectifying non-linearities employed in CNN architectures, the distribution of the layer outputs is asymmetric. Therefore, in order for BN to fully benefit from the aforementioned properties, we propose approximating the underlying data distribution not with a single Gaussian density but with a mixture of Gaussians. Deriving the Fisher vector for a Gaussian Mixture Model (GMM) reveals that batch normalization can be improved by independently normalizing with respect to the statistics of disentangled sub-populations. We refer to our proposed soft piecewise version of batch normalization as Mixture Normalization (MN). Through an extensive set of experiments on CIFAR-10 and CIFAR-100, using both a 5-layer-deep CNN and the modern Inception-V3 architecture, we show that mixture normalization reduces the number of gradient updates required to reach the maximum test accuracy of the batch-normalized model by ∼31%-47% across a variety of training scenarios. Replacing even a few BN modules with MN in the 48-layer-deep Inception-V3 architecture is sufficient to obtain not only considerable training acceleration but also better final test accuracy. We show that similar observations hold for 40- and 100-layer-deep DenseNet architectures as well. We complement our study by evaluating the application of mixture normalization to Generative Adversarial Networks (GANs), where "mode collapse" hinders the training process. We replace only a few batch normalization layers in the generator with our proposed mixture normalization. Our experiments using a Deep Convolutional GAN (DCGAN) on CIFAR-10 show that the mixture normalized DCGAN not only provides an acceleration of ∼58% but also reaches a lower (better) Fréchet Inception Distance (FID) of 33.35, compared to 37.56 for its batch-normalized counterpart.
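The mixture normalization idea — normalize each activation with respect to every GMM component's statistics, weighted by that component's posterior responsibility for the sample — can be sketched for a 1-D activation distribution as follows. The component parameters are assumed given (e.g. from an EM fit), and this is a simplified illustration rather than the paper's implementation:

```python
import numpy as np

def mixture_normalize(x, means, variances, weights, eps=1e-5):
    """Soft piecewise normalization (sketch of the Mixture Normalization
    idea): normalize by each Gaussian component's mean and variance,
    weighted by the component's posterior responsibility r_k(x)."""
    x = np.asarray(x, dtype=float)
    # Log-density of x under each component, for the soft assignment.
    log_p = np.stack([
        np.log(w) - 0.5 * np.log(2.0 * np.pi * v) - 0.5 * (x - m) ** 2 / v
        for m, v, w in zip(means, variances, weights)
    ])
    log_p -= log_p.max(axis=0)          # subtract max for numerical stability
    r = np.exp(log_p)
    r /= r.sum(axis=0)                  # responsibilities sum to 1 per sample
    # Responsibility-weighted sum of per-component normalizations.
    out = np.zeros_like(x)
    for k, (m, v) in enumerate(zip(means, variances)):
        out += r[k] * (x - m) / np.sqrt(v + eps)
    return out
```

With a single component, the responsibilities are all 1 and the transform collapses to ordinary batch normalization with the given statistics, which matches the paper's framing of BN as the one-Gaussian special case.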

  4. We present cosmological constraints derived from peak counts, minimum counts, and the angular power spectrum of the Subaru Hyper Suprime-Cam first-year (HSC Y1) weak lensing shear catalogue. Weak lensing peak and minimum counts contain non-Gaussian information and hence are complementary to the conventional two-point statistics in constraining cosmology. In this work, we forward-model the three summary statistics and their dependence on cosmology, using a suite of N-body simulations tailored to the HSC Y1 data. We investigate systematic and astrophysical effects including intrinsic alignments, baryon feedback, multiplicative bias, and photometric redshift uncertainties. We mitigate the impact of these systematics by applying cuts on angular scales, smoothing scales, signal-to-noise ratio bins, and tomographic redshift bins. By combining peaks, minima, and the power spectrum, assuming a flat-ΛCDM model, we obtain $S_{8} \equiv \sigma _8\sqrt{\Omega _m/0.3}= 0.810^{+0.022}_{-0.026}$, a 35 per cent tighter constraint than that obtained from the angular power spectrum alone. Our results are in agreement with other studies using HSC weak lensing shear data, as well as with Planck 2018 cosmology and recent CMB lensing constraints from the Atacama Cosmology Telescope and the South Pole Telescope.

  5. Abstract

    In high energy physics, one of the most important processes in collider data analysis is the comparison of collected and simulated data. Nowadays the state of the art for data generation is Monte Carlo (MC) generators. However, with the upcoming high-luminosity upgrade of the Large Hadron Collider (LHC), there will not be enough computational power or time to produce the needed amount of simulated data using MC methods. An alternative approach under study is the use of machine learning generative methods for this task. Since the most common final-state objects of high-energy proton collisions are hadronic jets, which are collections of particles collimated in a given region of space, this work aims to develop a convolutional variational autoencoder (ConVAE) for the generation of particle-based LHC hadronic jets. Given the ConVAE's limitations, a normalizing flow (NF) network is coupled to it in a two-step training process, which improves the quality of the generated jets. The ConVAE+NF network is capable of generating a jet in 18.30 ± 0.04 μs, making it one of the fastest methods for this task to date.
