Star cluster classification in the PHANGS–HST survey: Comparison between human and machine learning approaches
ABSTRACT When completed, the PHANGS–HST project will provide a census of roughly 50 000 compact star clusters and associations, as well as human morphological classifications for roughly 20 000 of those objects. These large numbers motivated the development of a more objective and repeatable method to help perform source classifications. In this paper, we consider the results for five PHANGS–HST galaxies (NGC 628, NGC 1433, NGC 1566, NGC 3351, NGC 3627) using classifications from two convolutional neural network architectures (RESNET and VGG) trained using deep transfer learning techniques. The results are compared to classifications performed by humans. The primary result is that the neural network classifications are comparable in quality to the human classifications with typical agreement around 70 to 80 per cent for Class 1 clusters (symmetric, centrally concentrated) and 40 to 70 per cent for Class 2 clusters (asymmetric, centrally concentrated). If Class 1 and 2 are considered together the agreement is 82 ± 3 per cent. Dependencies on magnitudes, crowding, and background surface brightness are examined. A detailed description of the criteria and methodology used for the human classifications is included along with an examination of systematic differences between PHANGS–HST and LEGUS. The distribution of data points in a colour–colour diagram is used as a ‘figure of …
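The agreement percentages quoted in the abstract can be read as the fraction of objects that a human placed in a given class (or group of classes) and that the network placed in the same class or group. The following is a minimal sketch under that assumption; the paper's exact definition may differ. The function name `class_agreement` and the label lists are hypothetical and are not data from the survey.

```python
def class_agreement(human, model, classes):
    """Fraction of objects whose human label is in `classes`
    and whose model label is also in `classes`.

    Assumed definition of 'agreement'; not taken from the paper.
    """
    classes = set(classes)
    # Keep only objects that the human classifier placed in the target classes.
    selected = [(h, m) for h, m in zip(human, model) if h in classes]
    if not selected:
        return float("nan")
    # Count how many of those the model also placed in the target classes.
    matches = sum(1 for _, m in selected if m in classes)
    return matches / len(selected)


# Hypothetical labels for ten objects (classes 1-4), for illustration only.
human = [1, 1, 1, 1, 2, 2, 2, 3, 4, 4]
model = [1, 1, 1, 2, 2, 1, 3, 3, 4, 1]

agree_c1 = class_agreement(human, model, [1])      # Class 1 alone
agree_c2 = class_agreement(human, model, [2])      # Class 2 alone
agree_c12 = class_agreement(human, model, [1, 2])  # Class 1 and 2 together
```

Merging Class 1 and 2 counts a Class 1 object labelled Class 2 by the network (or vice versa) as a match, which is why the combined agreement can exceed either single-class value.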
NSF-PAR ID:
10287464
Journal Name:
Monthly Notices of the Royal Astronomical Society
Volume:
506
Issue:
4
Page Range or eLocation-ID:
5294 to 5317
ISSN:
0035-8711
1. ABSTRACT We present the results of a proof-of-concept experiment that demonstrates that deep learning can successfully be used for production-scale classification of compact star clusters detected in Hubble Space Telescope (HST) ultraviolet-optical imaging of nearby spiral galaxies ($D\lesssim 20\, \textrm{Mpc}$) in the Physics at High Angular Resolution in Nearby GalaxieS (PHANGS)–HST survey. Given the relatively small nature of existing, human-labelled star cluster samples, we transfer the knowledge of state-of-the-art neural network models for real-object recognition to classify star cluster candidates into four morphological classes. We perform a series of experiments to determine the dependence of classification performance on neural network architecture (ResNet18 and VGG19-BN), training data sets curated by either a single expert or three astronomers, and the size of the images used for training. We find that the overall classification accuracies are not significantly affected by these choices. The networks are used to classify star cluster candidates in the PHANGS–HST galaxy NGC 1559, which was not included in the training samples. The resulting prediction accuracies are 70 per cent, 40 per cent, 40–50 per cent, and 50–70 per cent for class 1, 2, 3 star clusters, and class 4 non-clusters, respectively. This performance is competitive with consistency achieved in previously published human and automated quantitative classification of star …
3. ABSTRACT We explore unsupervised machine learning for galaxy morphology analyses using a combination of feature extraction with a vector-quantized variational autoencoder (VQ-VAE) and hierarchical clustering (HC). We propose a new methodology that includes: (1) consideration of the clustering performance simultaneously when learning features from images; (2) allowing for various distance thresholds within the HC algorithm; (3) using the galaxy orientation to determine the number of clusters. This set-up provides 27 clusters created with this unsupervised learning that we show are well separated based on galaxy shape and structure (e.g. Sérsic index, concentration, asymmetry, Gini coefficient). These resulting clusters also correlate well with physical properties such as the colour–magnitude diagram, and span the range of scaling relations such as mass versus size amongst the different machine-defined clusters. When we merge these multiple clusters into two large preliminary clusters to provide a binary classification, an accuracy of ∼87 per cent is reached using an imbalanced data set, matching real galaxy distributions, which includes 22.7 per cent early-type galaxies and 77.3 per cent late-type galaxies. Comparing the given clusters with classic Hubble types (ellipticals, lenticulars, early spirals, late spirals, and irregulars), we show that there is an intrinsic vagueness in visual classification systems, in particular …
5. ABSTRACT We analyse the cold dark matter density profiles of 54 galaxy haloes simulated with Feedback In Realistic Environments (FIRE)-2 galaxy formation physics, each resolved within 0.5 per cent of the halo virial radius. These haloes contain galaxies with masses that range from ultrafaint dwarfs ($M_\star \simeq 10^{4.5}\, \mathrm{M}_{\odot }$) to the largest spirals ($M_\star \simeq 10^{11}\, \mathrm{M}_{\odot }$) and have density profiles that are both cored and cuspy. We characterize our results using a new, analytic density profile that extends the standard two-parameter Einasto form to allow for a pronounced constant density core in the resolved innermost radius. With one additional core-radius parameter, $r_{\rm c}$, this three-parameter core-Einasto profile is able to characterize our feedback-impacted dark matter haloes more accurately than other three-parameter profiles proposed in the literature. To enable comparisons with observations, we provide fitting functions for $r_{\rm c}$ and other profile parameters as a function of both $M_\star$ and $M_\star/M_{\rm halo}$. In agreement with past studies, we find that dark matter core formation is most efficient at the characteristic stellar-to-halo mass ratio $M_\star/M_{\rm halo} \simeq 5 \times 10^{-3}$, or $M_{\star } \sim 10^9 \, \mathrm{M}_{\odot }$, with cores that are roughly the size of the galaxy half-light radius, $r_{\rm c} \simeq 1$–$5$ kpc. Furthermore, …
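For orientation, the standard Einasto profile and one way a core radius could enter it are sketched below. The first expression is the usual Einasto form, which has two free parameters when the shape index $\alpha$ is held fixed. The second, which replaces $r$ by $r + r_{\rm c}$, is an assumption made here for illustration, since the abstract does not state the functional form. The symbols $\rho_{-2}$, $r_{-2}$, $\alpha$, $\tilde{\rho}_s$, and $\tilde{r}_s$ do not appear in the abstract and are introduced only for this sketch.

$$
\rho_{\rm Ein}(r) = \rho_{-2}\, \exp\left\{ -\frac{2}{\alpha} \left[ \left( \frac{r}{r_{-2}} \right)^{\alpha} - 1 \right] \right\}
$$

$$
\rho_{\rm cEin}(r) = \tilde{\rho}_s\, \exp\left\{ -\frac{2}{\alpha} \left[ \left( \frac{r + r_{\rm c}}{\tilde{r}_s} \right)^{\alpha} - 1 \right] \right\}
$$

In this sketch, setting $r_{\rm c} = 0$ recovers the Einasto form, while for $r \ll r_{\rm c}$ the density tends to a constant, which is the behaviour the abstract describes as a constant density core.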