This content will become publicly available on March 1, 2023

Neural Networks as Optimal Estimators to Marginalize Over Baryonic Effects
Abstract Many different studies have shown that a wealth of cosmological information resides on small, nonlinear scales. Unfortunately, there are two challenges to overcome to utilize that information. First, we do not know the optimal estimator that will allow us to retrieve the maximum information. Second, baryonic effects impact that regime significantly and in a poorly understood manner. Ideally, we would like to use an estimator that extracts the maximum cosmological information while marginalizing over baryonic effects. In this work we show that neural networks can achieve that when considering some simple scenarios. We use data where the maximum amount of cosmological information is known: power spectra and 2D Gaussian density fields. We also contaminate the data with simplified baryonic effects and train neural networks to predict the value of the cosmological parameters. For this data, we show that neural networks can (1) extract the maximum available cosmological information, (2) marginalize over baryonic effects, and (3) extract cosmological information that is buried in the regime dominated by baryonic physics. We also show that neural networks learn the priors of the data they are trained on, affecting their extrapolation properties. We conclude that a promising strategy to maximize the …
NSF-PAR ID:
10331327
Journal Name:
The Astrophysical Journal
Volume:
928
Issue:
1
Page Range or eLocation-ID:
44
ISSN:
0004-637X
National Science Foundation
##### More Like this
1. ABSTRACT

Ongoing and planned weak lensing (WL) surveys are becoming deep enough to contain information on angular scales down to a few arcmin. To fully extract information from these small scales, we must capture non-Gaussian features in the cosmological WL signal while accurately accounting for baryonic effects. In this work, we account for baryonic physics via a baryonic correction model that modifies the matter distribution in dark matter-only N-body simulations, mimicking the effects of galaxy formation and feedback. We implement this model in a large suite of ray-tracing simulations, spanning a grid of cosmological models in Ωm−σ8 space. We then develop a convolutional neural network (CNN) architecture to learn and constrain cosmological and baryonic parameters simultaneously from the simulated WL convergence maps. We find that in a Hyper Suprime-Cam-like survey, our CNN achieves a 1.7× tighter constraint in Ωm−σ8 space (1σ area) than the power spectrum and 2.1× tighter than the peak counts, showing that the CNN can efficiently extract non-Gaussian cosmological information even while marginalizing over baryonic effects. When we combine our CNN with the power spectrum, the baryonic effects degrade the constraint in Ωm−σ8 space by a factor of 2.4, compared to the much worse degradation by a …

2.
Electroencephalography (EEG) is a popular clinical monitoring tool used for diagnosing brain-related disorders such as epilepsy [1]. As monitoring EEGs in a critical-care setting is an expensive and tedious task, there is great interest in developing real-time EEG monitoring tools to improve patient care quality and efficiency [2]. However, clinicians require automatic seizure detection tools that provide decisions with at least 75% sensitivity and less than 1 false alarm (FA) per 24 hours [3]. Some commercial tools have recently claimed to reach such performance levels, including the Olympic Brainz Monitor [4] and Persyst 14 [5]. In this abstract, we describe our efforts to transform a high-performance offline seizure detection system [3] into a low-latency real-time or online seizure detection system. An overview of the system is shown in Figure 1. The main difference between an online and an offline system is that an online system must always be causal and must have minimal latency, which is often defined by domain experts. The offline system, shown in Figure 2, uses two phases of deep learning models with postprocessing [3]. The channel-based long short-term memory (LSTM) model (Phase 1 or P1) processes linear frequency cepstral coefficients (LFCC) [6] features from each EEG …
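The clinical target quoted above (at least 75% sensitivity and fewer than 1 false alarm per 24 hours) can be checked with simple arithmetic. The following is a minimal sketch, not the authors' scoring software: the function names and the event counts in the example are hypothetical, and it assumes seizure events have already been matched to detections.

```python
def seizure_metrics(true_positives, false_negatives, false_alarms, record_hours):
    """Return (sensitivity, false alarms per 24 hours) from event counts.

    All argument names are illustrative assumptions, not from the source.
    """
    total_events = true_positives + false_negatives
    if total_events <= 0:
        raise ValueError("need at least one reference seizure event")
    if record_hours <= 0:
        raise ValueError("record_hours must be positive")
    sensitivity = true_positives / total_events
    # Normalize the raw false-alarm count to a 24-hour window.
    fa_per_24h = false_alarms * 24.0 / record_hours
    return sensitivity, fa_per_24h


def meets_clinical_target(sensitivity, fa_per_24h,
                          min_sensitivity=0.75, max_fa_per_24h=1.0):
    """Check the thresholds quoted in the abstract: >= 75% and < 1 FA/24 h."""
    return sensitivity >= min_sensitivity and fa_per_24h < max_fa_per_24h


# Hypothetical example: 80 detected seizures, 20 missed, 3 false alarms in 96 h.
sens, fa_rate = seizure_metrics(80, 20, 3, 96.0)
ok = meets_clinical_target(sens, fa_rate)
```

With these made-up counts the sensitivity is 0.8 and the false alarm rate is 0.75 per 24 hours, so the target is met.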
3.
The Neuronix high-performance computing cluster allows us to conduct extensive machine learning experiments on big data [1]. This heterogeneous cluster uses innovative scheduling technology, Slurm [2], that manages a network of CPUs and graphics processing units (GPUs). The GPU farm consists of a variety of processors ranging from low-end consumer grade devices such as the Nvidia GTX 970 to higher-end devices such as the GeForce RTX 2080. These GPUs are essential to our research since they allow extremely compute-intensive deep learning tasks to be executed on massive data resources such as the TUH EEG Corpus [2]. We use TensorFlow [3] as the core machine learning library for our deep learning systems, and routinely employ multiple GPUs to accelerate the training process. Reproducible results are essential to machine learning research. Reproducibility in this context means the ability to replicate an existing experiment – performance metrics such as error rates should be identical and floating-point calculations should match closely. Three examples of ways we typically expect an experiment to be replicable are: (1) The same job run on the same processor should produce the same results each time it is run. (2) A job run on a CPU and GPU should produce …
4. Abstract

We study the use of deep learning techniques to reconstruct the kinematics of the neutral current deep inelastic scattering (DIS) process in electron–proton collisions. In particular, we use simulated data from the ZEUS experiment at the HERA accelerator facility, and train deep neural networks to reconstruct the kinematic variables $Q^2$ and $x$. Our approach is based on the information used in the classical construction methods, the measurements of the scattered lepton, and the hadronic final state in the detector, but is enhanced through correlations and patterns revealed with the simulated data sets. We show that, with the appropriate selection of a training set, the neural networks sufficiently surpass all classical reconstruction methods on most of the kinematic range considered. Rapid access to large samples of simulated data and the ability of neural networks to effectively extract information from large data sets, both suggest that deep learning techniques to reconstruct DIS kinematics can serve as a rigorous method to combine and outperform the classical reconstruction methods.
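For orientation, one widely used classical reconstruction that relies only on the scattered lepton is the electron method. The abstract does not say which classical methods were compared, so the formulas below are a standard textbook illustration rather than the authors' specific baseline. The symbols are assumptions introduced here: $E_e$ is the lepton beam energy, $E'_e$ and $\theta_e$ are the energy and polar angle of the scattered lepton (angle measured from the proton beam direction), and $s$ is the squared center-of-mass energy.

$$
y_e = 1 - \frac{E'_e}{2E_e}\,(1 - \cos\theta_e), \qquad
Q^2_e = 2\,E_e E'_e\,(1 + \cos\theta_e), \qquad
x_e = \frac{Q^2_e}{s\,y_e}
$$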

5. Multiview analysis aims to extract common information from data entities across different domains (e.g., acoustic, visual, text). Canonical correlation analysis (CCA) is one of the classic tools for this problem; it estimates the shared latent information by linearly transforming the different views of the data. CCA has also been generalized to the nonlinear regime, where kernel methods and neural networks are introduced to replace the linear transforms. While the theoretical aspects of linear CCA are relatively well understood, nonlinear multiview analysis is still largely intuition-driven. In this work, our interest lies in the identifiability of shared latent information under a nonlinear multiview analysis framework. We propose a model identification criterion for learning latent information from multiview data, under a reasonable data generating model. We show that minimizing this criterion leads to identification of the latent shared information up to certain indeterminacy. We also propose a neural-network-based implementation and an efficient algorithm to realize the criterion. Our analysis is backed by experiments on both synthetic and real data.
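As a point of reference for the linear case mentioned above, classical CCA can be sketched in a few lines of NumPy by whitening each view and taking an SVD of the whitened cross-covariance. This is a minimal sketch of textbook linear CCA only, not the nonlinear criterion proposed in the abstract; the function name `linear_cca`, the small ridge term `reg`, and the synthetic two-view data are all assumptions made for illustration.

```python
import numpy as np


def linear_cca(X, Y, n_components=1, reg=1e-8):
    """Textbook linear CCA via whitening + SVD (illustrative sketch).

    X: (n, dx) and Y: (n, dy) are two views of the same n samples.
    Returns projection matrices A (dx, k), B (dy, k) and the top-k
    canonical correlations.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Sample covariances; `reg` is a small ridge term for numerical stability.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    A = Kx @ U[:, :n_components]
    B = Ky @ Vt.T[:, :n_components]
    return A, B, s[:n_components]


# Synthetic example: both views are noisy linear functions of one shared latent z.
rng = np.random.default_rng(0)
z = rng.standard_normal(2000)
X = np.outer(z, [1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal((2000, 3))
Y = np.outer(z, [0.7, 1.5]) + 0.1 * rng.standard_normal((2000, 2))

A, B, corr = linear_cca(X, Y)
u = (X - X.mean(axis=0)) @ A[:, 0]  # first canonical variate of view X
```

In this toy setup the first canonical variate `u` should track the shared latent `z` up to sign and scale, which is the kind of indeterminacy the abstract refers to for the linear case.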