
Title: Deep learning methods for obtaining photometric redshift estimations from images
ABSTRACT

Knowing the redshift of galaxies is one of the first requirements of many cosmological experiments, and because it is impossible to perform spectroscopy for every galaxy being observed, photometric redshift (photo-z) estimates remain of particular interest. Here, we investigate different deep learning methods for obtaining photo-z estimates directly from images, comparing these with ‘traditional’ machine learning algorithms that use magnitudes retrieved through photometry. As well as testing a convolutional neural network (CNN) and an inception-module CNN, we introduce a novel mixed-input model that allows both images and magnitude data to be used in the same model as a way of further improving the estimated redshifts. We also perform benchmarking to demonstrate the performance and scalability of the different algorithms. The data used in the study come entirely from the Sloan Digital Sky Survey (SDSS), from which 1 million galaxies were used, each having 5-filter (ugriz) images with complete photometry and a spectroscopic redshift, which was taken as the ground truth. The mixed-input inception CNN achieved a mean squared error (MSE) = 0.009, a significant improvement (30 per cent) over the traditional random forest (RF), and the model performed even better at lower redshifts, achieving an MSE = 0.0007 (a 50 per cent improvement over the RF) in the range z < 0.3. This method could be hugely beneficial to upcoming surveys, such as Euclid and the Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST), which will require vast numbers of photo-z estimates produced as quickly and accurately as possible.
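
The comparison metric quoted in the abstract can be made concrete with a minimal sketch. It assumes the MSE is taken over per-galaxy point estimates against spectroscopic redshifts, and that "improvement" means the fractional reduction in MSE relative to a baseline; both readings are assumptions rather than definitions confirmed by the abstract. The function names and toy values are hypothetical and are not taken from the paper.

```python
def mean_squared_error(z_spec, z_phot):
    """Mean squared error between spectroscopic and photometric redshifts."""
    if len(z_spec) != len(z_phot) or not z_spec:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((s - p) ** 2 for s, p in zip(z_spec, z_phot)) / len(z_spec)


def relative_improvement(mse_baseline, mse_model):
    """Fractional reduction in MSE relative to a baseline (0.3 means 30 per cent)."""
    return (mse_baseline - mse_model) / mse_baseline


# Toy values for illustration only; they are not results from the paper.
z_spec = [0.10, 0.20, 0.30, 0.40]        # "ground truth" redshifts
z_baseline = [0.14, 0.16, 0.34, 0.36]    # hypothetical baseline estimates
z_model = [0.12, 0.18, 0.32, 0.38]       # hypothetical improved estimates

mse_baseline = mean_squared_error(z_spec, z_baseline)
mse_model = mean_squared_error(z_spec, z_model)
improvement = relative_improvement(mse_baseline, mse_model)

print(f"baseline MSE = {mse_baseline:.4f}, model MSE = {mse_model:.4f}")
print(f"relative improvement = {100 * improvement:.0f} per cent")
```

Because the MSE squares each residual, halving every error in the toy data reduces the MSE by a factor of four, which is why a relative improvement in MSE should not be read as the same relative improvement in typical redshift error.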

Authors:
Publication Date:
NSF-PAR ID: 10364226
Journal Name: Monthly Notices of the Royal Astronomical Society
Volume: 512
Issue: 2
Page Range or eLocation-ID: p. 1696-1709
ISSN: 0035-8711
Publisher: Oxford University Press
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    A reliable estimate of the redshift distribution n(z) is crucial for using weak gravitational lensing and large-scale structures of galaxy catalogs to study cosmology. Spectroscopic redshifts for the dim and numerous galaxies of next-generation weak-lensing surveys are expected to be unavailable, making photometric redshift (photo-z) probability density functions (PDFs) the next best alternative for comprehensively encapsulating the nontrivial systematics affecting photo-z point estimation. The established stacked estimator of n(z) avoids reducing photo-z PDFs to point estimates but yields a systematically biased estimate of n(z) that worsens with a decreasing signal-to-noise ratio, the very regime where photo-z PDFs are most necessary. We introduce Cosmological Hierarchical Inference with Probabilistic Photometric Redshifts (CHIPPR), a statistically rigorous probabilistic graphical model of redshift-dependent photometry that correctly propagates the redshift uncertainty information beyond the best-fit estimator of n(z) produced by traditional procedures and is provably the only self-consistent way to recover n(z) from photo-z PDFs. We present the chippr prototype code, noting that the mathematically justifiable approach incurs computational cost. The CHIPPR approach is applicable to any one-point statistic of any random variable, provided the prior probability density used to produce the posteriors is explicitly known; if the prior is implicit, as may be the case for popular photo-z techniques, then the resulting posterior PDFs cannot be used for scientific inference. We therefore recommend that the photo-z community focus on developing methodologies that enable the recovery of photo-z likelihoods with support over all redshifts, either directly or via a known prior probability density.

  2. ABSTRACT We introduce a probabilistic approach to select 6 ≤ $z$ ≤ 8 quasar candidates for spectroscopic follow-up, which is based on density estimation in the high-dimensional space inhabited by the optical and near-infrared photometry. Densities are modelled as Gaussian mixtures with principled accounting of errors using the extreme deconvolution (XD) technique, generalizing an approach successfully used to select lower redshift ($z$ ≤ 3) quasars. We train the probability density of contaminants on 1 902 071 7-d flux measurements from the 1076 deg$^2$ overlapping area from the Dark Energy Camera Legacy Survey (DECaLS) ($z$), VIKING ($YJHK_{\rm s}$), and unWISE (W1W2) imaging surveys, after requiring that they drop out of DECaLS g and r, whereas the distribution of high-$z$ quasars is trained on synthetic model photometry. Extensive simulations based on these density distributions and current estimates of the quasar luminosity function indicate that this method achieves a completeness of ≥56 per cent and an efficiency of ≥5 per cent for selecting quasars at 6 < $z$ < 8 with $J_{\rm AB}$ < 21.5. Among the classified sources are 8 known 6 < $z$ < 7 quasars, of which 2/8 are selected, suggesting a completeness ≃25 per cent, whereas classifying the 6 known ($J_{\rm AB}$ < 21.5) quasars at $z$ > 7 from the entire sky, we select 5/6 or a completeness of ≃80 per cent. The failure to select the majority of 6 < $z$ < 7 quasars arises because our quasar density model is based on an empirical quasar spectral energy distribution model that underestimates the scatter in the distribution of fluxes. This new approach to quasar selection paves the way for efficient spectroscopic follow-up of Euclid quasar candidates with ground-based telescopes and the James Webb Space Telescope.
  3. ABSTRACT

    We present a mock image catalogue of ∼100 000 $M_{\rm UV}$ ≃ −22.5 to −19.6 mag galaxies at z = 7–12 from the BlueTides cosmological simulation. We create mock images of each galaxy with the James Webb Space Telescope (JWST), Hubble, Roman, and Euclid Space Telescopes, as well as Subaru and VISTA, with a range of near- and mid-infrared filters. We perform photometry on the mock images to estimate the success of these instruments for detecting high-z galaxies. We predict that JWST will have unprecedented power in detecting high-z galaxies, with a 95 per cent completeness limit at least 2.5 mag fainter than VISTA and Subaru, 1.1 mag fainter than Hubble, and 0.9 mag fainter than Roman, for the same wavelength and exposure time. Focusing on JWST, we consider a range of exposure times and filters, and find that the NIRCam F356W and F277W filters will detect the faintest galaxies, with 95 per cent completeness at m ≃ 27.4 mag in 10-ks exposures. We also predict the number of high-z galaxies that will be discovered by upcoming JWST imaging surveys. We predict that the COSMOS-Web survey will detect ∼1000 $M_{1500\,\text{Å}}$ < −20.1 mag galaxies at 6.5 < z < 7.5, by virtue of its large survey area. JADES-Medium will detect almost 100 per cent of $M_{1500\,\text{Å}}$ ≲ −20 mag galaxies at z < 8.5 owing to its significant depth; however, with its smaller survey area it will detect only ∼100 of these galaxies at 6.5 < z < 7.5. Cosmic variance results in a large range in the number of predicted galaxies each survey will detect, which is more evident in smaller surveys such as CEERS and the PEARLS NEP and GOODS-S fields.

  4. ABSTRACT

    We present a catalogue of 4499 groups and clusters of galaxies from the first data release of the multi-filter (5 broad, 7 narrow) Southern Photometric Local Universe Survey (S-PLUS). These groups and clusters are distributed over 273 deg$^2$ in the Stripe 82 region. They are found using the PzWav algorithm, which identifies peaks in galaxy density maps that have been smoothed by a cluster scale difference-of-Gaussians kernel to isolate clusters and groups. Using a simulation-based mock catalogue, we estimate the purity and completeness of cluster detections: at S/N > 3.3, we define a catalogue that is 80 per cent pure and complete in the redshift range 0.1 < z < 0.4, for clusters with $M_{200} > 10^{14}$ M⊙. We also assessed the accuracy of the catalogue in terms of central positions and redshifts, finding scatter of $\sigma_R$ = 12 kpc and $\sigma_z = 8.8 \times 10^{-3}$, respectively. Moreover, less than 1 per cent of the sample suffers from fragmentation or overmerging. The S-PLUS cluster catalogue recovers ∼80 per cent of all known X-ray and Sunyaev-Zel’dovich selected clusters in this field. This fraction is very close to the estimated completeness, thus validating the mock data analysis and paving an efficient way to find new groups and clusters of galaxies using data from the ongoing S-PLUS project. When complete, S-PLUS will have surveyed 9300 deg$^2$ of the sky, representing the widest uninterrupted areas with narrow-through-broad multi-band photometry for cluster follow-up studies.

  5. ABSTRACT We study the projected spatial offset between the ultraviolet continuum and Ly α emission for 65 lensed and unlensed galaxies in the Epoch of Reionization (5 ≤ z ≤ 7), the first such study at these redshifts, in order to understand the potential for these offsets to confuse estimates of the Ly α properties of galaxies observed in slit spectroscopy. While we find that ∼40 per cent of galaxies in our sample show significant projected spatial offsets ($|\Delta _{\rm {Ly}\alpha -\rm {UV}}|$), we find a relatively modest average projected offset of $|\widetilde{\Delta }_{\rm {Ly}\alpha -\rm {UV}}|$ = 0.61 ± 0.08 proper kpc for the entire sample. A small fraction of our sample, ∼10 per cent, exhibit offsets in excess of 2 proper kpc, with offsets seen up to ∼4 proper kpc, sizes that are considerably larger than the effective radii of typical galaxies at these redshifts. An internal comparison and a comparison to studies at lower redshift yielded no significant evidence of evolution of $|\Delta _{\rm {Ly}\alpha -\rm {UV}}|$ with redshift. In our sample, ultraviolet (UV)-bright galaxies ($\widetilde{L_{\mathrm{ UV}}}/L^{\ast }_{\mathrm{ UV}}=0.67$) showed offsets a factor of three greater than their fainter counterparts ($\widetilde{L_{\mathrm{ UV}}}/L^{\ast }_{\mathrm{ UV}}=0.10$), 0.89 ± 0.18 versus 0.27 ± 0.05 proper kpc, respectively. The presence of companion galaxies and early stage merging activity appeared to be unlikely causes of these offsets. Rather, these offsets appear consistent with a scenario in which internal anisotropic processes resulting from stellar feedback, which is stronger in UV-brighter galaxies, facilitate Ly α fluorescence and/or backscattering from nearby or outflowing gas. The reduction in the Ly α flux due to offsets was quantified. It was found that the differential loss of Ly α photons for galaxies with average offsets is not, if corrected for, a limiting factor for all but the narrowest slit widths (<0.4 arcsec). However, for the largest offsets, if they are mostly perpendicular to the slit major axis, slit losses were found to be extremely severe in cases where slit widths of ≤1 arcsec were employed, such as those planned for James Webb Space Telescope/NIRSpec observations.