Title: Star cluster classification in the PHANGS–HST survey: Comparison between human and machine learning approaches
ABSTRACT When completed, the PHANGS–HST project will provide a census of roughly 50 000 compact star clusters and associations, as well as human morphological classifications for roughly 20 000 of those objects. These large numbers motivated the development of a more objective and repeatable method to help perform source classifications. In this paper, we consider the results for five PHANGS–HST galaxies (NGC 628, NGC 1433, NGC 1566, NGC 3351, NGC 3627) using classifications from two convolutional neural network architectures (RESNET and VGG) trained using deep transfer learning techniques. The results are compared to classifications performed by humans. The primary result is that the neural network classifications are comparable in quality to the human classifications with typical agreement around 70 to 80 per cent for Class 1 clusters (symmetric, centrally concentrated) and 40 to 70 per cent for Class 2 clusters (asymmetric, centrally concentrated). If Class 1 and 2 are considered together the agreement is 82 ± 3 per cent. Dependencies on magnitudes, crowding, and background surface brightness are examined. A detailed description of the criteria and methodology used for the human classifications is included along with an examination of systematic differences between PHANGS–HST and LEGUS. The distribution of data points in a colour–colour diagram is used as a ‘figure of merit’ to further test the relative performances of the different methods. The effects on science results (e.g. determinations of mass and age functions) of using different cluster classification methods are examined and found to be minimal.
Award ID(s):
1934757
PAR ID:
10287464
Journal Name:
Monthly Notices of the Royal Astronomical Society
Volume:
506
Issue:
4
ISSN:
0035-8711
Page Range / eLocation ID:
5294 to 5317
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. ABSTRACT We present the results of a proof-of-concept experiment that demonstrates that deep learning can successfully be used for production-scale classification of compact star clusters detected in Hubble Space Telescope (HST) ultraviolet-optical imaging of nearby spiral galaxies ($$D\lesssim 20\, \textrm{Mpc}$$) in the Physics at High Angular Resolution in Nearby GalaxieS (PHANGS)–HST survey. Given the relatively small nature of existing, human-labelled star cluster samples, we transfer the knowledge of state-of-the-art neural network models for real-object recognition to classify star cluster candidates into four morphological classes. We perform a series of experiments to determine the dependence of classification performance on neural network architecture (ResNet18 and VGG19-BN), training data sets curated by either a single expert or three astronomers, and the size of the images used for training. We find that the overall classification accuracies are not significantly affected by these choices. The networks are used to classify star cluster candidates in the PHANGS–HST galaxy NGC 1559, which was not included in the training samples. The resulting prediction accuracies are 70 per cent, 40 per cent, 40–50 per cent, and 50–70 per cent for class 1, 2, 3 star clusters, and class 4 non-clusters, respectively. This performance is competitive with consistency achieved in previously published human and automated quantitative classification of star cluster candidate samples (70–80 per cent, 40–50 per cent, 40–50 per cent, and 60–70 per cent). The methods introduced herein lay the foundations to automate classification for star clusters at scale, and exhibit the need to prepare a standardized data set of human-labelled star cluster classifications, agreed upon by a full range of experts in the field, to further improve the performance of the networks introduced in this study. 
  2. ABSTRACT We have identified 189 candidate z > 1.3 protoclusters and clusters in the LSST Deep Drilling Fields. This sample will enable the measurement of the metal enrichment and star formation history of clusters during their early assembly period through the direct measurement of the rate of supernovae identified through the LSST. The protocluster sample was selected from galaxy overdensities in a Spitzer/IRAC colour-selected sample using criteria that were optimized for protocluster purity using a realistic light-cone. Our tests reveal that 60–80 per cent of the identified candidates are likely to be genuine protoclusters or clusters, which is corroborated by a ∼4σ stacked X-ray signal from these structures. We provide photometric redshift estimates for 47 candidates which exhibit strong peaks in the photo-z distribution of their candidate members. However, the lack of a photo-z peak does not mean a candidate is not genuine, since we find a stacked X-ray signal of similar significance from both the candidates that exhibit photo-z peaks and those that do not. Tests on the light-cone reveal that our pursuit of a pure sample of protoclusters results in that sample being highly incomplete (∼4 per cent) and heavily biased towards larger, richer, more massive, and more centrally concentrated protoclusters than the total protocluster population. Most (∼75 per cent) of the selected protoclusters are likely to have a maximum collapsed halo mass of between $$10^{13}$$ and $$10^{14}\, \rm M_\odot$$, with only ∼25 per cent likely to be collapsed clusters above $$10^{14}\, \rm M_\odot$$. However, the aforementioned bias ensures our sample is ∼50 per cent complete for structures that have already collapsed into clusters more massive than $$10^{14}\, \rm M_\odot$$. 
  3. Abstract PHANGS-HST is an ultraviolet-optical imaging survey of 38 spiral galaxies within ∼20 Mpc. Combined with the PHANGS-ALMA, PHANGS-MUSE surveys and other multiwavelength data, the dataset will provide an unprecedented look into the connections between young stars, H II regions, and cold molecular gas in these nearby star-forming galaxies. Accurate distances are needed to transform measured observables into physical parameters (e.g., brightness to luminosity, angular to physical sizes of molecular clouds, star clusters and associations). PHANGS-HST has obtained parallel ACS imaging of the galaxy halos in the F606W and F814W bands. Where possible, we use these parallel fields to derive tip of the red giant branch (TRGB) distances to these galaxies. In this paper, we present TRGB distances for 11 galaxies from ∼4 to ∼15 Mpc, based on the first year of PHANGS-HST observations. Five of these represent the first published TRGB distance measurements (IC 5332, NGC 2835, NGC 4298, NGC 4321, and NGC 4328), and eight are the best available distances to these targets. We also provide a compilation of distances for the 118 galaxies in the full PHANGS sample, which have been adopted for the first PHANGS-ALMA public data release. 
  4. ABSTRACT We analyse Gaia EDR3 and re-calibrated HST proper motion data from the core-collapsed and non-core-collapsed globular clusters NGC 6397 and NGC 3201, respectively, with the Bayesian mass-orbit modelling code MAMPOSSt-PM. We use Bayesian evidence and realistic mock data sets constructed with Agama to select between different mass models. In both clusters, the velocities are consistent with isotropy within the extent of our data. We robustly detect a dark central mass (DCM) of roughly $$1000\, \rm M_\odot$$ in both clusters. Our MAMPOSSt-PM fits strongly prefer an extended DCM in NGC 6397, while only presenting a mild preference for it in NGC 3201, with respective sizes of roughly one and a few per cent of the cluster effective radius. We explore the astrophysics behind our results with the CMC Monte Carlo N-body code, whose snapshots best matching the phase space observations lead to similar values for the mass and size of the DCM. The internal kinematics are thus consistent with a population of hundreds of massive white dwarfs in NGC 6397, and roughly 100 segregated stellar-mass black holes in NGC 3201, as previously found with CMC. Such analyses confirm the accuracy of both mass-orbit modelling and Monte Carlo N-body techniques, which together provide more robust predictions on the DCM of globular clusters (core-collapsed or not). This opens possibilities to understand a vast range of interesting astrophysical phenomena in clusters, such as fast radio bursts, compact object mergers, and gravitational waves. 
  5. Abstract We present the largest catalog to date of star clusters and compact associations in nearby galaxies. We have performed a V-band-selected census of clusters across the 38 spiral galaxies of the PHANGS–Hubble Space Telescope (HST) Treasury Survey, and measured integrated, aperture-corrected near-ultraviolet-U-B-V-I photometry. This work has resulted in uniform catalogs that contain ∼20,000 clusters and compact associations, which have passed human inspection and morphological classification, and a larger sample of ∼100,000 classified by neural network models. Here, we report on the observed properties of these samples, and demonstrate that tremendous insight can be gained from just the observed properties of clusters, even in the absence of their transformation into physical quantities. In particular, we show the utility of the UBVI color–color diagram, and the three principal features revealed by the PHANGS-HST cluster sample: the young cluster locus, the middle-age plume, and the old globular cluster clump. We present an atlas of maps of the 2D spatial distribution of clusters and compact associations in the context of the molecular clouds from PHANGS–Atacama Large Millimeter/submillimeter Array. We explore new ways of understanding this large data set in a multiscale context by bringing together once-separate techniques for the characterization of clusters (color–color diagrams and spatial distributions) and their parent galaxies (galaxy morphology and location relative to the galaxy main sequence). A companion paper presents the physical properties: ages, masses, and dust reddenings derived using improved spectral energy distribution fitting techniques. 