Title: Galaxy–Galaxy lensing in HSC: Validation tests and the impact of heterogeneous spectroscopic training sets

Although photometric redshifts (photo-z’s) are crucial ingredients for current and upcoming large-scale surveys, the high-quality spectroscopic redshifts currently available to train, validate, and test them are substantially non-representative in both magnitude and colour. We investigate the nature and structure of this bias by tracking how objects from a heterogeneous training sample contribute to photo-z predictions as a function of magnitude and colour, and illustrate that the underlying redshift distribution at fixed colour can evolve strongly as a function of magnitude. We then test the robustness of the galaxy–galaxy lensing signal in 120 deg$^2$ of HSC–SSP DR1 data to spectroscopic completeness and photo-z biases, and find that their impacts are sub-dominant to current statistical uncertainties. Our methodology provides a framework to investigate how spectroscopic incompleteness can impact photo-z-based weak lensing predictions in future surveys such as LSST and WFIRST.
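As a rough illustration of why photo-z biases matter for a galaxy–galaxy lensing measurement, the sketch below uses the standard relation ΔΣ = Σ_crit γ_t, in which Σ_crit at fixed lens redshift scales as D_s / D_ls. It is not the estimator or pipeline used in the paper: the flat ΛCDM cosmology with Ω_m = 0.3, the single lens–source pair, and all function names are assumptions made only for this example.

```python
import math

# Illustrative assumptions (not from the paper): flat LambdaCDM with
# Omega_m = 0.3, distances in Mpc/h, one lens and one source galaxy.
OMEGA_M = 0.3
HUBBLE_DISTANCE = 2997.92458  # c / H0 in Mpc/h


def comoving_distance(z, steps=2000):
    """Line-of-sight comoving distance via the trapezoidal rule."""
    if z <= 0.0:
        return 0.0
    dz = z / steps
    total = 0.0
    for i in range(steps + 1):
        zi = i * dz
        e_of_z = math.sqrt(OMEGA_M * (1.0 + zi) ** 3 + (1.0 - OMEGA_M))
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight / e_of_z
    return HUBBLE_DISTANCE * total * dz


def lensing_efficiency(z_lens, z_source):
    """D_ls / D_s, proportional to 1 / Sigma_crit at fixed lens redshift.

    In a flat universe this equals (chi_s - chi_l) / chi_s; sources at or
    in front of the lens carry no signal.
    """
    if z_source <= z_lens:
        return 0.0
    chi_l = comoving_distance(z_lens)
    chi_s = comoving_distance(z_source)
    return (chi_s - chi_l) / chi_s


def delta_sigma_ratio(z_lens, z_true, z_phot):
    """Inferred / true Delta Sigma when Sigma_crit is evaluated at z_phot."""
    assumed = lensing_efficiency(z_lens, z_phot)
    if assumed == 0.0:
        raise ValueError("z_phot must lie behind the lens")
    return lensing_efficiency(z_lens, z_true) / assumed


if __name__ == "__main__":
    # Hypothetical lens at z = 0.3 and a photo-z overestimate of 0.05.
    for z_true in (0.4, 0.8, 1.2):
        ratio = delta_sigma_ratio(0.3, z_true, z_true + 0.05)
        print(f"z_true={z_true:.1f}: inferred/true = {ratio:.3f}")
```

In this toy setup an overestimated source redshift lowers the inferred ΔΣ, and the effect is stronger for sources close behind the lens than for distant ones.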

Publisher / Repository:
Oxford University Press
Journal Name:
Monthly Notices of the Royal Astronomical Society
Page Range / eLocation ID:
p. 5658-5677
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    A reliable estimate of the redshift distribution n(z) is crucial for using weak gravitational lensing and large-scale structures of galaxy catalogs to study cosmology. Spectroscopic redshifts for the dim and numerous galaxies of next-generation weak-lensing surveys are expected to be unavailable, making photometric redshift (photo-z) probability density functions (PDFs) the next best alternative for comprehensively encapsulating the nontrivial systematics affecting photo-z point estimation. The established stacked estimator of n(z) avoids reducing photo-z PDFs to point estimates but yields a systematically biased estimate of n(z) that worsens with a decreasing signal-to-noise ratio, the very regime where photo-z PDFs are most necessary. We introduce Cosmological Hierarchical Inference with Probabilistic Photometric Redshifts (CHIPPR), a statistically rigorous probabilistic graphical model of redshift-dependent photometry that correctly propagates the redshift uncertainty information beyond the best-fit estimator of n(z) produced by traditional procedures and is provably the only self-consistent way to recover n(z) from photo-z PDFs. We present the chippr prototype code, noting that the mathematically justifiable approach incurs computational cost. The CHIPPR approach is applicable to any one-point statistic of any random variable, provided the prior probability density used to produce the posteriors is explicitly known; if the prior is implicit, as may be the case for popular photo-z techniques, then the resulting posterior PDFs cannot be used for scientific inference. We therefore recommend that the photo-z community focus on developing methodologies that enable the recovery of photo-z likelihoods with support over all redshifts, either directly or via a known prior probability density.

  2. Abstract The accurate estimation of photometric redshifts is crucial to many upcoming galaxy surveys, for example, the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST). Almost all Rubin extragalactic and cosmological science requires accurate and precise calculation of photometric redshifts; many diverse approaches to this problem are currently in the process of being developed, validated, and tested. In this work, we use the photometric redshift code GPz to examine two realistically complex training set imperfection scenarios for machine learning based photometric redshift calculation: (i) where the spectroscopic training set has a very different distribution in color–magnitude space to the test set, and (ii) where the effect of emission line confusion causes a fraction of the training spectroscopic sample to not have the true redshift. By evaluating the sensitivity of GPz to a range of increasingly severe imperfections, with a range of metrics (both of photo-z point estimates as well as posterior probability distribution functions, PDFs), we quantify the degree to which predictions get worse with higher degrees of degradation. In particular, we find that there is a substantial drop-off in photo-z quality when line-confusion goes above ∼1%, and sample incompleteness below a redshift of 1.5, for an experimental setup using data from the Buzzard Flock synthetic sky catalogs.
  3. Abstract
    Photometric galaxy surveys constitute a powerful cosmological probe but rely on the accurate characterization of their redshift distributions using only broad-band imaging, and can be very sensitive to incomplete or biased priors used for redshift calibration. A hierarchical Bayesian model has recently been developed to estimate those from the robust combination of prior information, photometry of single galaxies, and the information contained in the galaxy clustering against a well-characterized tracer population. In this work, we extend the method so that it can be applied to real data, developing some necessary new extensions to it, especially in the treatment of galaxy clustering information, and we test it on realistic simulations. After marginalizing over the mapping between the clustering estimator and the actual density distribution of the sample galaxies, and using prior information from a small patch of the survey, we find the incorporation of clustering information with photo-z’s tightens the redshift posteriors and overcomes biases in the prior that mimic those happening in spectroscopic samples. The method presented here uses all the information at hand to reduce prior biases and incompleteness. Even in cases where we artificially bias the spectroscopic sample to induce a shift in mean redshift of $\Delta \bar{z} \approx 0.05$, the final biases in the posterior are $\Delta \bar{z} \lesssim 0.003$. This robustness to flaws in the redshift prior or training samples would constitute a milestone for the control of redshift systematic uncertainties in future weak lensing analyses.

  4. Abstract

    Radial colour gradients within galaxies arise from gradients of stellar age, metallicity, and dust reddening. Large samples of colour gradients from wide-area imaging surveys can complement smaller integral-field spectroscopy data sets and can be used to constrain galaxy formation models. Here, we measure colour gradients for low-redshift galaxies (z < 0.1) using photometry from the DESI Legacy Imaging Survey DR9. Our sample comprises ∼93 000 galaxies with spectroscopic redshifts and ∼574 000 galaxies with photometric redshifts. We focus on gradients across a radial range $0.5R_{\rm eff}$ to $R_{\rm eff}$, which corresponds to the inner disc of typical late-type systems at low redshift. This region has been the focus of previous statistical studies of colour gradients and has recently been explored by spectroscopic surveys such as MaNGA. We find that the colour gradients of most galaxies in our sample are negative (redder towards the centre), consistent with the literature. We investigate empirical relationships between colour gradient, average g − r and r − z colour, $M_r$, $M_\star$, and sSFR. Trends of gradient strength with $M_r$ ($M_\star$) show an inflection around $M_r \sim -21$ ($\log _{10} \, M_\star /\mathrm{M_\odot }\sim 10.5$). Below this mass, colour gradients become steeper with increasing $M_\star$, whereas colour gradients in more massive galaxies become shallower. We find that positive gradients (bluer stars at smaller radii) are typical for galaxies of $M_{\star }\sim 10^{8}\, \mathrm{M_\odot }$. We compare our results to age and metallicity gradients in two data sets derived from fits of different stellar population libraries to MaNGA spectra, but find no clear consensus explanation for the trends we observe. Both MaNGA data sets seem to imply a significant contribution from dust reddening, in particular, to explain the flatness of colour gradients along the red sequence.


  5. Abstract

    Knowing the redshift of galaxies is one of the first requirements of many cosmological experiments, and as it is impossible to perform spectroscopy for every galaxy being observed, photometric redshift (photo-z) estimations are still of particular interest. Here, we investigate different deep learning methods for obtaining photo-z estimates directly from images, comparing these with ‘traditional’ machine learning algorithms which make use of magnitudes retrieved through photometry. As well as testing a convolutional neural network (CNN) and inception-module CNN, we introduce a novel mixed-input model that allows for both images and magnitude data to be used in the same model as a way of further improving the estimated redshifts. We also perform benchmarking as a way of demonstrating the performance and scalability of the different algorithms. The data used in the study comes entirely from the Sloan Digital Sky Survey (SDSS) from which 1 million galaxies were used, each having 5-filter (ugriz) images with complete photometry and a spectroscopic redshift which was taken as the ground truth. The mixed-input inception CNN achieved a mean squared error (MSE) = 0.009, which was a significant improvement (30 per cent) over the traditional random forest (RF), and the model performed even better at lower redshifts, achieving an MSE = 0.0007 (a 50 per cent improvement over the RF) in the range of z < 0.3. This method could be hugely beneficial to upcoming surveys, such as Euclid and the Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST), which will require vast numbers of photo-z estimates produced as quickly and accurately as possible.
