Title: Automated measurement of quasar redshift with a Gaussian process
ABSTRACT We develop an automated technique to measure quasar redshifts in the Baryon Oscillation Spectroscopic Survey of the Sloan Digital Sky Survey (SDSS). Our technique extends an earlier Gaussian process method for detecting damped Lyman α absorbers (DLAs) in quasar spectra with known redshifts. We apply this technique to a subsample of SDSS DR12 with BAL quasars removed and redshift larger than 2.15. We show that our method is broadly competitive with existing quasar redshift estimators, disagreeing with the PCA redshift by more than 0.5 in only 0.38 per cent of spectra. Our method produces a probability density function for the quasar redshift, allowing quasar redshift uncertainty to be propagated to downstream users. We apply this method to detecting DLAs, accounting in a Bayesian fashion for redshift uncertainty. Compared to our earlier method with a known quasar redshift, we have a moderate decrease in our ability to detect DLAs, predominantly in the noisiest spectra. The area under curve drops from 0.96 to 0.91. Our code is publicly available.
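The abstract describes two probabilistic steps: producing a density over quasar redshift, and carrying that uncertainty into DLA detection. The following is a minimal sketch of that idea only, not the authors' released code. It assumes a log-likelihood has already been evaluated on a grid of candidate redshifts (the Gaussian process likelihood itself is not reproduced here), and the function names, grid, and toy numbers are all hypothetical.

```python
import numpy as np


def redshift_posterior(log_likelihood, log_prior=None):
    """Normalise per-redshift log-likelihoods into a discrete posterior.

    Hypothetical helper: the inputs stand in for whatever model likelihood
    is evaluated at each candidate redshift on a grid.
    """
    log_likelihood = np.asarray(log_likelihood, dtype=float)
    if log_prior is None:
        # Assumption: flat prior over the grid.
        log_prior = np.zeros_like(log_likelihood)
    log_post = log_likelihood + log_prior
    log_post -= log_post.max()  # subtract the maximum for numerical stability
    post = np.exp(log_post)
    return post / post.sum()


def marginal_dla_probability(posterior, p_dla_given_z):
    """Average a per-redshift DLA probability over the redshift posterior."""
    return float(np.dot(posterior, np.asarray(p_dla_given_z, dtype=float)))


# Toy inputs, invented purely for illustration.
z_grid = np.linspace(2.15, 5.0, 5)
toy_log_like = np.array([-10.0, -2.0, 0.0, -2.0, -10.0])
toy_p_dla = np.array([0.1, 0.2, 0.9, 0.2, 0.1])

post = redshift_posterior(toy_log_like)
z_map = z_grid[np.argmax(post)]  # maximum a posteriori redshift on the grid
p_dla = marginal_dla_probability(post, toy_p_dla)
```

Working on a grid keeps the full posterior available to downstream users, who can take its maximum as a point estimate or average other quantities over it, as the last line does for the DLA probability.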
Award ID(s):
1845434
NSF-PAR ID:
10216845
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
Monthly Notices of the Royal Astronomical Society
Volume:
498
Issue:
4
ISSN:
0035-8711
Page Range / eLocation ID:
5227 to 5239
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. ABSTRACT

    Quasar absorption line analysis is critical for studying gas and dust components and their physical and chemical properties as well as the evolution and formation of galaxies in the early universe. Calcium II (Ca ii) absorbers, which are among the dustiest absorbers and are located at lower redshifts than most other absorbers, are especially valuable when studying physical processes and conditions in recent galaxies. However, the number of known quasar Ca ii absorbers is relatively low due to the difficulty of detecting them with traditional methods. In this work, we developed an accurate and quick approach to search for Ca ii absorption lines using deep learning. In our deep learning model, a convolutional neural network, tuned using simulated data, is used for the classification task. The simulated training data are generated by inserting artificial Ca ii absorption lines into original quasar spectra from the Sloan Digital Sky Survey (SDSS), while an existing Ca ii catalogue is adopted as the test set. The resulting model achieves an accuracy of 96 per cent on the real data in the test set. Our solution runs thousands of times faster than traditional methods, taking a fraction of a second to analyse thousands of quasars, while traditional methods may take days to weeks. The trained neural network was applied to quasar spectra from SDSS DR7 and DR12 and discovered 399 new quasar Ca ii absorbers. In addition, we confirmed 409 known quasar Ca ii absorbers identified previously by other research groups through traditional methods.

  2. ABSTRACT

    We assemble the largest C iv absorption line catalogue to date, leveraging machine learning, specifically Gaussian processes, to remove the need for visual inspection for detecting C iv absorbers. The catalogue contains probabilities classifying the reliability of the absorption system within a quasar spectrum. Our training set was a sub-sample of DR7 spectra that had no detectable C iv absorption in a large visually inspected catalogue. We used Bayesian model selection to decide between our continuum model and our absorption-line models. Using a random hold-out sample of 1301 spectra from all of the 26 030 investigated spectra in the DR7 C iv catalogue, we validated our pipeline and obtained an 87 per cent classification performance score. We found good purity and completeness values, both ∼80 per cent, when a probability of ∼95 per cent is used as the threshold. Our pipeline obtained similar C iv redshifts and rest equivalent widths to our training set. Applying our algorithm to 185 425 selected quasar spectra from SDSS DR12, we produce a catalogue of 113 775 C iv doublets with at least 95 per cent confidence. Our catalogue provides maximum a posteriori values and credible intervals for C iv redshift, column density, and Doppler velocity dispersion. We detect C iv absorption systems with a redshift range of 1.37–5.1, including 33 systems with a redshift larger than 5 and 549 absorber systems with a rest equivalent width greater than 2 Å at more than 95 per cent confidence. Our catalogue can be used to investigate the physical properties of the circumgalactic and intergalactic media.

  3. ABSTRACT

    We present a new catalogue of Damped Lyman-α absorbers from SDSS DR16Q, as well as new estimates of their statistical properties. Our estimates are computed with the Gaussian process models presented in Garnett et al., Ho, Bird & Garnett with an improved model for marginalizing uncertainty in the mean optical depth of each quasar. We compute the column density distribution function (CDDF) at 2 < z < 5, the line density (dN/dX), and the neutral hydrogen density ($\Omega_{\rm DLA}$). Our Gaussian process model provides a posterior probability distribution of the number of DLAs per spectrum, thus allowing unbiased probabilistic predictions of the statistics of DLA populations even with the noisiest data. We measure a non-zero column density distribution function for $N_{\rm HI} < 3 \times 10^{22}\,{\rm cm}^{-2}$ with 95 per cent confidence limits, and $N_{\rm HI} \lesssim 10^{22}\,{\rm cm}^{-2}$ for spectra with signal-to-noise ratios >4. Our results for DLA line density and total hydrogen density are consistent with previous measurements. Despite a small bias due to the poorly measured blue edges of the spectra, we demonstrate that our new model can measure the DLA population statistics when the DLA is in the Lyman-β forest region. We verify our results are not sensitive to the signal-to-noise ratios and redshifts of the background quasars although a residual correlation remains for detections from $z_{\rm QSO} < 2.5$, indicating some residual systematics when applying our models on very short spectra, where the SDSS spectral observing window only covers part of the Lyman-α forest.

  4. ABSTRACT

    We present a revised version of our automated technique using Gaussian processes (GPs) to detect damped Lyman α absorbers (DLAs) along quasar (QSO) sightlines. The main improvement is to allow our GP pipeline to detect multiple DLAs along a single sightline. Our DLA detections are regularized by an improved model for the absorption from the Lyman α forest that improves performance at high redshift. We also introduce a model for unresolved sub-DLAs that reduces misclassifications of absorbers without detectable damping wings. We compare our results to those of two different large-scale DLA catalogues and provide a catalogue of the processed results of our GP pipeline using 158 825 Lyman α spectra from SDSS data release 12. We present updated estimates for the statistical properties of DLAs, including the column density distribution function, line density (dN/dX), and neutral hydrogen density ($\Omega_{\rm DLA}$).

  5. ABSTRACT

    We report discoveries of 165 new quasar Ca ii absorbers from the Sloan Digital Sky Survey (SDSS) Data Releases 7 and 12. Our Ca ii rest-frame equivalent width distribution supports the weak and strong subpopulations, split at ${W}^{\lambda 3934}_{0}=0.7$ Å. Comparison of both populations’ dust depletion shows clear consistency for weak absorber association with halo-type gas in the Milky Way (MW), while strong absorbers have environments consistent with halo and disc-type gas. We probed our high-redshift Ca ii absorbers for 2175 Å dust bumps, discovering twelve 2175 Å dust absorbers (2DAs). This clearly shows that some Ca ii absorbers follow the Large Magellanic Cloud (LMC) extinction law rather than the Small Magellanic Cloud extinction law. About 33 per cent of our strong Ca ii absorbers exhibit the 2175 Å dust bump, while only 6 per cent of weak Ca ii absorbers show this bump. 2DA detection further supports the theory that strong Ca ii absorbers are associated with disc components and are dustier than the weak population. Comparing average Ca ii absorber dust depletion patterns to that of Damped Ly α absorbers (DLAs), Mg ii absorbers, and 2DAs shows that Ca ii absorbers generally have environments with more dust than DLAs and Mg ii absorbers, but less dust than 2DAs. Comparing 2175 Å dust bump strengths from different samples and also the MW and LMC, the bump strength appears to grow stronger as the redshift decreases, indicating dust growth and the global chemical enrichment of galaxies in the Universe over time.

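Several of the catalogues listed above attach a probability to each detection, and item 2 quotes purity and completeness at a probability threshold of about 95 per cent. As a hedged illustration of how those two quantities follow from such a threshold, the sketch below computes them from per-object probabilities and truth labels. The function name and the toy arrays are hypothetical and are not taken from any of the cited pipelines.

```python
import numpy as np


def purity_completeness(probabilities, truth, threshold=0.95):
    """Purity and completeness of detections selected above a threshold.

    purity       = true detections / all selected objects
    completeness = true detections / all real absorbers
    """
    probabilities = np.asarray(probabilities, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    selected = probabilities >= threshold
    true_pos = np.count_nonzero(selected & truth)
    n_selected = np.count_nonzero(selected)
    n_true = np.count_nonzero(truth)
    # Return NaN rather than dividing by zero when a denominator is empty.
    purity = true_pos / n_selected if n_selected else float("nan")
    completeness = true_pos / n_true if n_true else float("nan")
    return purity, completeness


# Toy data, invented for illustration only.
toy_p = [0.99, 0.97, 0.96, 0.50, 0.10]
toy_truth = [True, True, False, True, False]
purity, completeness = purity_completeness(toy_p, toy_truth)
```

Raising the threshold generally trades completeness for purity, which is why a catalogue that publishes probabilities lets each user choose the cut that suits their analysis.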