This content will become publicly available on June 1, 2025

Title: Dark Energy Survey Deep Field photometric redshift performance and training incompleteness assessment

Context. The determination of accurate photometric redshifts (photo-zs) in large imaging galaxy surveys is key for cosmological studies. Machine learning techniques are among the most common approaches. These methods require a spectroscopic or reference sample to train the algorithms. Attention has to be paid to the quality and properties of these samples, since they are key factors in the estimation of reliable photo-zs.

Aims. The goal of this work is to calculate the photo-zs for the Year 3 (Y3) Dark Energy Survey (DES) Deep Fields catalogue using the Directional Neighborhood Fitting (DNF) machine learning algorithm. Moreover, we want to develop techniques to assess the incompleteness of the training sample and metrics to study how incompleteness affects the quality of photometric redshifts. Finally, we are interested in comparing the performance obtained by DNF on the Y3 DES Deep Fields catalogue with that of the EAzY template fitting approach.

Methods. We emulated, at a brighter magnitude, the training incompleteness using a spectroscopic sample whose redshifts are known, so as to have a measurable view of the problem. We used a principal component analysis to graphically assess the incompleteness and relate it to the performance parameters provided by DNF. Finally, we applied the incompleteness results to the photo-z computation on the Y3 DES Deep Fields with DNF and estimated its performance.
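As a rough illustration of how a principal component analysis can expose training incompleteness, the sketch below projects a target sample and a training sample onto the target's two leading principal components and measures how much of the target falls in cells that contain training objects. It is not the procedure used in the paper: the function names, the gridded coverage statistic, and the choice of two components are assumptions made for this example only.

```python
import numpy as np

def pca_fit(X, n_components=2):
    """Return the feature means and the leading principal axes of X (rows = galaxies)."""
    mean = X.mean(axis=0)
    # SVD of the centred data; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_project(X, mean, axes):
    """Project X onto the given principal axes."""
    return (X - mean) @ axes.T

def coverage_fraction(target_pc, train_pc, bins=10):
    """Fraction of target galaxies lying in 2D PC cells occupied by training galaxies.

    Hypothetical summary statistic: a value well below 1 suggests that
    regions of the target feature space are missing from the training sample.
    """
    lo, hi = target_pc.min(axis=0), target_pc.max(axis=0)
    edges = [np.linspace(lo[i], hi[i], bins + 1) for i in range(2)]
    h_target, _, _ = np.histogram2d(target_pc[:, 0], target_pc[:, 1], bins=edges)
    h_train, _, _ = np.histogram2d(train_pc[:, 0], train_pc[:, 1], bins=edges)
    covered = h_train > 0
    return h_target[covered].sum() / h_target.sum()
```

In practice the rows of `X` would be magnitudes or colours. Fitting the axes on the target sample rather than on the training sample keeps regions that the training set never reaches visible in the projection.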

Results. The photo-zs of the galaxies in the DES deep fields were computed with the DNF algorithm and added to the Y3 DES Deep Fields catalogue. We developed techniques to evaluate the performance in the absence of “true” redshifts and to assess the completeness. We studied the trade-off in the training sample between the highest spectroscopic redshift quality and completeness. We found some advantages in relaxing the highest-quality spectroscopic redshift requirements at fainter magnitudes in favour of completeness. The results achieved by DNF on the Y3 Deep Fields are competitive with the ones provided by EAzY, showing notable stability at high redshifts. The good results obtained by DNF in the estimation of photo-zs in deep field catalogues make DNF suitable for the future Legacy Survey of Space and Time (LSST) and Euclid data, which will have similar depths to the Y3 DES Deep Fields.

Publisher / Repository:
EDP Sciences
Journal Name:
Astronomy & Astrophysics
Medium: X
Sponsoring Org:
National Science Foundation
More Like this

  1. Abstract

    Although photometric redshifts (photo-z’s) are crucial ingredients for current and upcoming large-scale surveys, the high-quality spectroscopic redshifts currently available to train, validate, and test them are substantially non-representative in both magnitude and colour. We investigate the nature and structure of this bias by tracking how objects from a heterogeneous training sample contribute to photo-z predictions as a function of magnitude and colour, and illustrate that the underlying redshift distribution at fixed colour can evolve strongly as a function of magnitude. We then test the robustness of the galaxy–galaxy lensing signal in 120 deg² of HSC–SSP DR1 data to spectroscopic completeness and photo-z biases, and find that their impacts are sub-dominant to current statistical uncertainties. Our methodology provides a framework to investigate how spectroscopic incompleteness can impact photo-z-based weak lensing predictions in future surveys such as LSST and WFIRST.

  2. Abstract

    Large imaging surveys will rely on photometric redshifts (photo-z's), which are typically estimated through machine-learning methods. Currently planned spectroscopic surveys will not be deep enough to produce a representative training sample for Legacy Survey of Space and Time (LSST), so we seek methods to improve the photo-z estimates that arise from nonrepresentative training samples. Spectroscopic training samples for photo-z's are biased toward redder, brighter galaxies, which also tend to be at lower redshift than the typical galaxy observed by LSST, leading to poor photo-z estimates with outlier fractions nearly 4 times larger than for a representative training sample. In this Letter, we apply the concept of training sample augmentation, where we augment simulated nonrepresentative training samples with simulated galaxies possessing otherwise unrepresented features. When we select simulated galaxies with (g − z) color, i-band magnitude, and redshift outside the range of the original training sample, we are able to reduce the outlier fraction of the photo-z estimates for simulated LSST data by nearly 50% and the normalized median absolute deviation (NMAD) by 56%. When compared to a fully representative training sample, augmentation can recover nearly 70% of the degradation in the outlier fraction and 80% of the degradation in NMAD. Training sample augmentation is a simple and effective way to improve training samples for photo-z's without requiring additional spectroscopic samples.
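The two quality metrics named in this abstract can be computed along the following lines. This is a minimal sketch under a commonly used convention, not necessarily the exact definitions adopted by the authors: the 1.4826 scaling, the subtraction of the median residual, and the 0.15 outlier threshold are assumptions, as is the function name.

```python
import numpy as np

def photoz_metrics(z_phot, z_spec, outlier_cut=0.15):
    """Return (NMAD, outlier fraction) for photo-z point estimates.

    Assumed conventions (not taken from the abstract):
      dz      = (z_phot - z_spec) / (1 + z_spec)
      NMAD    = 1.4826 * median(|dz - median(dz)|)
      outlier = |dz| > outlier_cut
    """
    z_phot = np.asarray(z_phot, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    dz = (z_phot - z_spec) / (1.0 + z_spec)
    nmad = 1.4826 * np.median(np.abs(dz - np.median(dz)))
    outlier_fraction = np.mean(np.abs(dz) > outlier_cut)
    return nmad, outlier_fraction
```

Because the NMAD is built from medians, a small number of catastrophic failures barely changes it, which is why it is usually reported together with the outlier fraction.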

  3. Abstract

    The accurate estimation of photometric redshifts is crucial to many upcoming galaxy surveys, for example, the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST). Almost all Rubin extragalactic and cosmological science requires accurate and precise calculation of photometric redshifts; many diverse approaches to this problem are currently in the process of being developed, validated, and tested. In this work, we use the photometric redshift code GPz to examine two realistically complex training set imperfection scenarios for machine learning based photometric redshift calculation: (i) where the spectroscopic training set has a very different distribution in color–magnitude space to the test set, and (ii) where the effect of emission line confusion causes a fraction of the training spectroscopic sample to not have the true redshift. By evaluating the sensitivity of GPz to a range of increasingly severe imperfections, with a range of metrics (both of photo-z point estimates as well as posterior probability distribution functions, PDFs), we quantify the degree to which predictions get worse with higher degrees of degradation. In particular, we find that there is a substantial drop-off in photo-z quality when line-confusion goes above ∼1%, and sample incompleteness below a redshift of 1.5, for an experimental setup using data from the Buzzard Flock synthetic sky catalogs.
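Scenario (ii) can be emulated with a few lines of code. If a line with rest wavelength λ_true is mistaken for one with rest wavelength λ_assumed, the observed wavelength λ_true (1 + z_true) = λ_assumed (1 + z_wrong) gives z_wrong = (λ_true / λ_assumed)(1 + z_true) − 1. The sketch below applies that relation to a random fraction of a training sample. The function names and the single-line-pair setup are hypothetical, and the abstract does not say which line pairs or injection scheme the authors used.

```python
import numpy as np

def confused_redshift(z_true, rest_true, rest_assumed):
    """Redshift inferred when a line at rest_true is misidentified as rest_assumed.

    Both wavelengths must be in the same units. The result can be negative
    when rest_true < rest_assumed and z_true is small.
    """
    return (rest_true / rest_assumed) * (1.0 + z_true) - 1.0

def inject_line_confusion(z, fraction, rest_true, rest_assumed, rng):
    """Replace a random `fraction` of redshifts by their line-confused values.

    Returns the contaminated copy and the indices that were altered.
    """
    z = np.asarray(z, dtype=float)
    contaminated = z.copy()
    n_bad = int(round(fraction * z.size))
    idx = rng.choice(z.size, size=n_bad, replace=False)
    contaminated[idx] = confused_redshift(z[idx], rest_true, rest_assumed)
    return contaminated, idx
```

For example, `inject_line_confusion(z_train, 0.01, 6562.8, 5006.8, np.random.default_rng(1))` would mislabel 1% of a hypothetical `z_train` array as if Hα (6562.8 Å) had been taken for [O III] (5006.8 Å); this line pair is chosen purely for illustration.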

  4. Abstract

    We present an alternative calibration of the MagLim lens sample redshift distributions from the Dark Energy Survey (DES) first 3 yr of data (Y3). The new calibration is based on a combination of a self-organizing-map-based scheme and clustering redshifts to estimate redshift distributions and inherent uncertainties, which is expected to be more accurate than the original DES Y3 redshift calibration of the lens sample. We describe the methodology in detail, validate it on simulations, and discuss the main effects dominating our error budget. The new calibration is in fair agreement with the fiducial DES Y3 n(z) calibration, with only mild differences (<3σ) in the means and widths of the distributions. We study the impact of this new calibration on cosmological constraints, analysing DES Y3 galaxy clustering and galaxy–galaxy lensing measurements, assuming a Lambda cold dark matter cosmology. We obtain Ωm = 0.30 ± 0.04, σ8 = 0.81 ± 0.07, and S8 = 0.81 ± 0.04, which implies a ∼0.4σ shift in the Ωm − S8 plane compared to the fiducial DES Y3 results, highlighting the importance of the redshift calibration of the lens sample in multiprobe cosmological analyses.

  5. Abstract

    We present a catalog of 1.4 million photometrically selected quasar candidates in the southern hemisphere over the ∼5000 deg² Dark Energy Survey (DES) wide survey area. We combine optical photometry from the DES second data release (DR2) with available near-infrared (NIR) and the all-sky unWISE mid-infrared photometry in the selection. We build models of quasars, galaxies, and stars with multivariate skew-t distributions in the multidimensional space of relative fluxes as functions of redshift (or color for stars) and magnitude. Our selection algorithm assigns probabilities for quasars, galaxies, and stars and simultaneously calculates photometric redshifts (photo-z) for quasar and galaxy candidates. Benchmarking on spectroscopically confirmed objects, we successfully classify (with photometry) 94.7% of quasars, 99.3% of galaxies, and 96.3% of stars when all IR bands (NIR YJHK and WISE W1W2) are available. The classification and photo-z regression success rates decrease when fewer bands are available. Our quasar (galaxy) photo-z quality, defined as the fraction of objects with the difference between the photo-z z_p and the spectroscopic redshift z_s, |Δz| ≡ |z_s − z_p|/(1 + z_s) ≤ 0.1, is 92.2% (98.1%) when all IR bands are available, decreasing to 72.2% (90.0%) using optical DES data only. Our photometric quasar catalog achieves an estimated completeness of 89% and purity of 79% at r < 21.5 (0.68 million quasar candidates), with reduced completeness and purity at 21.5 < r ≲ 24. Among the 1.4 million quasar candidates, 87,857 have existing spectra, and 84,978 (96.7%) of them are spectroscopically confirmed quasars. Finally, we provide quasar, galaxy, and star probabilities for all (0.69 billion) photometric sources in the DES DR2 coadded photometric catalog.
