ABSTRACT Astronomers have typically set out to solve supervised machine learning problems by creating their own representations from scratch. We show that deep learning models trained to answer every Galaxy Zoo DECaLS question learn meaningful semantic representations of galaxies that are useful for new tasks on which the models were never trained. We exploit these representations to outperform several recent approaches at practical tasks crucial for investigating large galaxy samples. The first task is identifying galaxies of similar morphology to a query galaxy. Given a single galaxy assigned a free text tag by humans (e.g. ‘#diffuse’), we can find galaxies matching that tag for most tags. The second task is identifying the most interesting anomalies to a particular researcher. Our approach is 100 per cent accurate at identifying the most interesting 100 anomalies (as judged by Galaxy Zoo 2 volunteers). The third task is adapting a model to solve a new task using only a small number of newly labelled galaxies. Models fine-tuned from our representation are better able to identify ring galaxies than models fine-tuned from terrestrial images (ImageNet) or trained from scratch. We solve each task with very few new labels; either one (for the similarity search) or several hundred (for anomaly detection or fine-tuning). This challenges the longstanding view that deep supervised methods require new large labelled data sets for practical use in astronomy. To help the community benefit from our pretrained models, we release our fine-tuning code zoobot. Zoobot is accessible to researchers with no prior experience in deep learning. 
                    
                            
                            Galaxy Zoo DESI: Detailed morphology measurements for 8.7M galaxies in the DESI Legacy Imaging Surveys
                        
                    
    
            ABSTRACT We present detailed morphology measurements for 8.67 million galaxies in the DESI Legacy Imaging Surveys (DECaLS, MzLS, and BASS, plus DES). These are automated measurements made by deep learning models trained on Galaxy Zoo volunteer votes. Our models typically predict the fraction of volunteers selecting each answer to within 5–10 per cent for every answer to every GZ question. The models are trained on newly collected votes for DESI-LS DR8 images as well as historical votes from GZ DECaLS. We also release the newly collected votes. Extending our morphology measurements outside of the previously released DECaLS/SDSS intersection increases our sky coverage by a factor of 4 (from 5000 to 19 000 deg²) and allows for full overlap with complementary surveys including ALFALFA and MaNGA.
        
    
    
- PAR ID: 10470004
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Monthly Notices of the Royal Astronomical Society
- Volume: 526
- Issue: 3
- ISSN: 0035-8711
- Format(s): Medium: X Size: p. 4768-4786
- Size(s): p. 4768-4786
- Sponsoring Org: National Science Foundation
More Like this
- 
            ABSTRACT We present Galaxy Zoo DECaLS: detailed visual morphological classifications for Dark Energy Camera Legacy Survey images of galaxies within the SDSS DR8 footprint. Deeper DECaLS images (r = 23.6 versus r = 22.2 from SDSS) reveal spiral arms, weak bars, and tidal features not previously visible in SDSS imaging. To best exploit the greater depth of DECaLS images, volunteers select from a new set of answers designed to improve our sensitivity to mergers and bars. Galaxy Zoo volunteers provide 7.5 million individual classifications over 314 000 galaxies. 140 000 galaxies receive at least 30 classifications, sufficient to accurately measure detailed morphology like bars, and the remainder receive approximately 5. All classifications are used to train an ensemble of Bayesian convolutional neural networks (a state-of-the-art deep learning method) to predict posteriors for the detailed morphology of all 314 000 galaxies. We use active learning to focus our volunteer effort on the galaxies which, if labelled, would be most informative for training our ensemble. When measured against confident volunteer classifications, the trained networks are approximately 99 per cent accurate on every question. Morphology is a fundamental feature of every galaxy; our human and machine classifications are an accurate and detailed resource for understanding how galaxies evolve.
- 
            Abstract We use luminous red galaxies selected from the imaging surveys that are being used for targeting by the Dark Energy Spectroscopic Instrument (DESI) in combination with CMB lensing maps from the Planck collaboration to probe the amplitude of large-scale structure over 0.4 ≤ z ≤ 1. Our galaxy sample, with an angular number density of approximately 500 deg⁻² over 18,000 deg², is divided into 4 tomographic bins by photometric redshift and the redshift distributions are calibrated using spectroscopy from DESI. We fit the galaxy autospectra and galaxy-convergence cross-spectra using models based on cosmological perturbation theory, restricting to large scales that are expected to be well described by such models. Within the context of ΛCDM, combining all 4 samples and using priors on the background cosmology from supernova and baryon acoustic oscillation measurements, we find $S_8 = \sigma_8 (\Omega_m/0.3)^{0.5} = 0.73 \pm 0.03$. This result is lower than the prediction of the ΛCDM model conditioned on the Planck data. Our data prefer a slower growth of structure at low redshift than the model predictions, though at only modest significance.
- 
            Abstract We use subhalo abundance and age distribution matching to create magnitude-limited mock galaxy catalogs at z ∼ 0.43, 0.52, and 0.63 with z-band and 3.4 μm W1-band absolute magnitudes and r−z and r−W1 colors. From these magnitude-limited mocks, we select mock luminous red galaxy (LRG) samples according to the (r−z)-based (optical) and (r−W1)-based (infrared) selection criteria for the LRG sample of the Dark Energy Spectroscopic Instrument (DESI) survey. Our models reproduce the number densities, luminosity functions, color distributions, and projected clustering of the DESI Legacy Surveys that are the basis for DESI LRG target selection. We predict the halo occupation statistics of both optical and IR DESI LRGs at fixed cosmology and assess the differences between the two LRG samples. We find that IR-based SHAM modeling represents the differences between the optical and IR LRG populations better than using the z band and that age distribution matching overpredicts the clustering of LRGs, implying that galaxy color is uncorrelated with halo age in the LRG regime. Both the optical and IR DESI LRG target selections exclude some of the most luminous galaxies that would appear to be LRGs based on their position on the red sequence in optical color–magnitude space. Both selections also yield populations with a nontrivial LRG–halo connection that does not reach unity for the most massive halos. We find that the IR selection achieves greater completeness (≳90%) than the optical selection across all redshift bins studied.
- 
            ABSTRACT This paper provides a comprehensive overview of how fitting of baryon acoustic oscillations (BAO) is carried out within the upcoming Dark Energy Spectroscopic Instrument’s (DESI) 2024 results using its DR1 data set, and the associated systematic error budget from theory and modelling of the BAO. We derive new results showing how non-linearities in the clustering of galaxies can cause potential biases in measurements of the isotropic ($\alpha_{\mathrm{iso}}$) and anisotropic ($\alpha_{\mathrm{ap}}$) BAO distance scales, and how these can be effectively removed with an appropriate choice of reconstruction algorithm. We then demonstrate how theory leads to a clear choice for how to model the BAO and develop, implement, and validate a new model for the remaining smooth-broad-band (i.e. without BAO) component of the galaxy clustering. Finally, we explore the impact of all remaining modelling choices on the BAO constraints from DESI using a suite of high-precision simulations, arriving at a set of best practices for DESI BAO fits, and an associated theory and modelling systematic error. Overall, our results demonstrate the remarkable robustness of the BAO to all our modelling choices and motivate a combined theory and modelling systematic error contribution to the post-reconstruction DESI BAO measurements of no more than 0.1 per cent (0.2 per cent) for its isotropic (anisotropic) distance measurements. We expect the theory and best practices laid out here to be applicable to other BAO experiments in the era of DESI and beyond.
 An official website of the United States government
