- Award ID(s):
- 2009251
- NSF-PAR ID:
- 10332320
- Date Published:
- Journal Name:
- Advances in neural information processing systems
- ISSN:
- 1049-5258
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Many astrophysical analyses depend on estimates of redshifts (a proxy for distance) determined from photometric (i.e., imaging) data alone. Inaccurate estimates of photometric redshift uncertainties can result in large systematic errors. However, probability distribution outputs from many photometric redshift methods do not follow the frequentist definition of a Probability Density Function (PDF) for redshift; that is, the fraction of times the true redshift falls between two limits z1 and z2 should equal the integral of the PDF between these limits. Previous works have used the global distribution of Probability Integral Transform (PIT) values to re-calibrate PDFs, but offsetting inaccuracies in different regions of feature space can conspire to limit the efficacy of the method. We leverage a recently developed regression technique that characterizes the local PIT distribution at any location in feature space to perform a local re-calibration of photometric redshift PDFs, resulting in calibrated predictive distributions. Though we focus on an example from astrophysics, our method can produce predictive distributions that are calibrated at all locations in feature space for any use case.
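The frequentist calibration condition described above can be checked with a global PIT test. The following is a minimal sketch on synthetic data, not the authors' method: it assumes Gaussian predictive PDFs, and the names `gaussian_cdf`, `pit_values`, and `central_coverage`, as well as all numerical values, are hypothetical choices made for illustration. It shows only the global check, not the local re-calibration the abstract proposes.

```python
import math
import random


def gaussian_cdf(z, mu, sigma):
    """CDF of a Gaussian predictive PDF evaluated at z (assumed PDF shape)."""
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))


def pit_values(z_true, mus, sigmas):
    """PIT value per object: the predictive CDF evaluated at the true redshift."""
    return [gaussian_cdf(z, m, s) for z, m, s in zip(z_true, mus, sigmas)]


def central_coverage(pits, level=0.68):
    """Fraction of PIT values inside the central interval of the given level.

    For calibrated PDFs the PIT values are uniform on [0, 1], so this
    fraction should be close to `level`.
    """
    lo = 0.5 - level / 2.0
    hi = 0.5 + level / 2.0
    return sum(lo <= p <= hi for p in pits) / len(pits)


# Synthetic data (hypothetical values): true redshifts scatter around the
# predicted mean with a known standard deviation.
random.seed(0)
n = 20000
true_sigma = 0.05
mus = [random.uniform(0.2, 1.5) for _ in range(n)]
z_true = [random.gauss(m, true_sigma) for m in mus]

# Case 1: the reported PDF width matches the true scatter.
cov_calibrated = central_coverage(pit_values(z_true, mus, [true_sigma] * n))

# Case 2: the reported PDF is too narrow by a factor of two (overconfident).
cov_overconfident = central_coverage(
    pit_values(z_true, mus, [0.5 * true_sigma] * n)
)

print(f"calibrated coverage:    {cov_calibrated:.3f}")
print(f"overconfident coverage: {cov_overconfident:.3f}")
```

In this sketch the calibrated case should give a coverage near the nominal 0.68, while the overconfident case falls well below it. A global check of this kind can still pass when miscalibrations in different regions of feature space cancel, which is the limitation the abstract says local PIT regression is meant to address.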
-
Abstract The accurate estimation of photometric redshifts is crucial to many upcoming galaxy surveys, for example, the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST). Almost all Rubin extragalactic and cosmological science requires accurate and precise calculation of photometric redshifts; many diverse approaches to this problem are currently being developed, validated, and tested. In this work, we use the photometric redshift code GPz to examine two realistically complex training-set imperfection scenarios for machine-learning-based photometric redshift calculation: (i) where the spectroscopic training set has a very different distribution in color–magnitude space to the test set, and (ii) where emission line confusion causes a fraction of the training spectroscopic sample to not have the true redshift. By evaluating the sensitivity of GPz to a range of increasingly severe imperfections, with a range of metrics (for both photo-z point estimates and posterior probability distribution functions, PDFs), we quantify the degree to which predictions get worse with higher degrees of degradation. In particular, we find that there is a substantial drop-off in photo-z quality when line confusion goes above ∼1%, and sample incompleteness below a redshift of 1.5, for an experimental setup using data from the Buzzard Flock synthetic sky catalogs.
-
Abstract A reliable estimate of the redshift distribution n(z) is crucial for using weak gravitational lensing and large-scale structures of galaxy catalogs to study cosmology. Spectroscopic redshifts for the dim and numerous galaxies of next-generation weak-lensing surveys are expected to be unavailable, making photometric redshift (photo-z) probability density functions (PDFs) the next best alternative for comprehensively encapsulating the nontrivial systematics affecting photo-z point estimation. The established stacked estimator of n(z) avoids reducing photo-z PDFs to point estimates but yields a systematically biased estimate of n(z) that worsens with a decreasing signal-to-noise ratio, the very regime where photo-z PDFs are most necessary. We introduce Cosmological Hierarchical Inference with Probabilistic Photometric Redshifts (CHIPPR), a statistically rigorous probabilistic graphical model of redshift-dependent photometry that correctly propagates the redshift uncertainty information beyond the best-fit estimator of n(z) produced by traditional procedures and is provably the only self-consistent way to recover n(z) from photo-z PDFs. We present the chippr prototype code, noting that the mathematically justifiable approach incurs computational cost. The CHIPPR approach is applicable to any one-point statistic of any random variable, provided the prior probability density used to produce the posteriors is explicitly known; if the prior is implicit, as may be the case for popular photo-z techniques, then the resulting posterior PDFs cannot be used for scientific inference. We therefore recommend that the photo-z community focus on developing methodologies that enable the recovery of photo-z likelihoods with support over all redshifts, either directly or via a known prior probability density.
-
A method is presented for predicting the space group of a structure given a calculated or measured atomic pair distribution function (PDF) from that structure. The method utilizes machine learning models trained on more than 100 000 PDFs calculated from structures in the 45 most heavily represented space groups. In particular, a convolutional neural network (CNN) model is presented which yields a promising result in that it correctly identifies the space group among the top-6 estimates 91.9% of the time. The CNN model also successfully identifies space groups for 12 out of 15 experimental PDFs. Interesting aspects of the failed estimates are discussed, which indicate that the CNN fails in ways similar to conventional indexing algorithms applied to conventional powder diffraction data. This preliminary success of the CNN model shows the possibility of model-independent assessment of PDF data on a wide class of materials.
-
ABSTRACT We present a method for mapping variations between probability distribution functions and apply this method within the context of measuring galaxy redshift distributions from imaging survey data. This method, which we name PITPZ for the probability integral transformations it relies on, uses a difference in curves between distribution functions in an ensemble as a transformation to apply to another distribution function, thus transferring the variation in the ensemble to the latter distribution function. This procedure is broadly applicable to the problem of uncertainty propagation. In the context of redshift distributions, for example, the uncertainty contribution due to certain effects can be studied effectively only in simulations, thus necessitating a transfer of variation measured in simulations to the redshift distributions measured from data. We illustrate the use of PITPZ by using the method to propagate photometric calibration uncertainty to redshift distributions of the Dark Energy Survey Year 3 weak lensing source galaxies. For this test case, we find that PITPZ yields a lensing amplitude uncertainty estimate due to photometric calibration error within 1 per cent of the truth, compared to as much as a 30 per cent underestimate when using traditional methods.