Title: Revealing the Milky Way’s most recent major merger with a Gaia EDR3 catalogue of machine-learned line-of-sight velocities

Machine learning can play a powerful role in inferring missing line-of-sight velocities from astrometry in surveys such as Gaia. In this paper, we apply a neural network to Gaia Early Data Release 3 (EDR3) and obtain line-of-sight velocities and associated uncertainties for ∼92 million stars. The network, which takes as input a star’s parallax, angular coordinates, and proper motions, is trained and validated on ∼6.4 million stars in Gaia with complete phase-space information. The network’s uncertainty on its velocity prediction is a key aspect of its design; by properly convolving these uncertainties with the inferred velocities, we obtain accurate stellar kinematic distributions. As a first science application, we use the new network-completed catalogue to identify candidate stars that belong to the Milky Way’s most recent major merger, Gaia-Sausage-Enceladus (GSE). We present the kinematic, energy, angular momentum, and spatial distributions of the ∼450 000 GSE candidates in this sample, and also study the chemical abundances of those with cross matches to GALAH and APOGEE. The network’s predictive power will only continue to improve with future Gaia data releases as the training set of stars with complete phase-space information grows. This work provides a first demonstration of how to use machine learning to exploit high-dimensional correlations in the data to infer line-of-sight velocities, and offers a template for how to train, validate, and apply such a neural network when complete observational data is not available.
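
The idea of convolving the network's uncertainties with its inferred velocities can be illustrated with a small Monte Carlo sketch. It assumes, as a simplification not stated in the abstract, that each star's prediction is a Gaussian with mean `mu` and standard deviation `sigma`; the function name and the toy numbers are hypothetical and do not come from the catalogue.

```python
import random
import statistics


def sample_population_velocities(mu, sigma, draws_per_star=1, seed=0):
    """Draw line-of-sight velocities from per-star Gaussian predictions.

    Assumption (illustrative only): star i's prediction is N(mu[i], sigma[i]^2).
    Pooling the draws gives the predicted means convolved with their errors.
    """
    rng = random.Random(seed)
    samples = []
    for m, s in zip(mu, sigma):
        for _ in range(draws_per_star):
            samples.append(rng.gauss(m, s))
    return samples


# Toy data: hypothetical predicted means and uncertainties, both in km/s.
toy_rng = random.Random(1)
mu = [toy_rng.gauss(0.0, 20.0) for _ in range(20000)]
sigma = [30.0] * len(mu)

samples = sample_population_velocities(mu, sigma)
spread_of_means = statistics.pstdev(mu)        # width of the point predictions
spread_of_samples = statistics.pstdev(samples)  # width after folding in errors
```

In this toy setup the pooled draws are broader than the point predictions alone, with a variance close to the sum of the two input variances. A histogram of the predicted means by themselves would therefore understate the width of the velocity distribution.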

Publisher / Repository:
Oxford University Press
Journal Name:
Monthly Notices of the Royal Astronomical Society
Medium: X Size: p. 1633-1645
Sponsoring Org:
National Science Foundation
More Like this
  1. Until the recent advent of Gaia Data Release 2 (DR2) and deep multi-object spectroscopy, it has been difficult to obtain 6D phase space information for large numbers of stars beyond 4 kpc, in particular towards the Galactic Centre, where dust and crowding are significant. We combine line-of-sight velocities from the Abundances and Radial velocity Galactic Origins Survey (ARGOS) with proper motions from Gaia DR2 to obtain a sample of ∼7000 red clump stars with 3D velocities. We perform a large-scale stellar kinematics study of the Milky Way bulge to characterize the bulge velocity ellipsoids in 20 fields. The tilt of the major-axis of the velocity ellipsoid in the radial-longitudinal velocity plane, or vertex deviation, is characteristic of non-axisymmetric systems and a significant tilt is a robust indicator of non-axisymmetry or bar presence. We compare the observations to the predicted kinematics of an N-body boxy-bulge model formed from dynamical instabilities. In the model, the lv values are strongly correlated with the angle (α) between the bulge major-axis and the Sun-Galactic centre line of sight. We use a maximum likelihood method to obtain an independent measurement of α, from bulge stellar kinematics alone, performing a robust error analysis. The most likely value of α given our model is α = (29 ± 3)°, with an additional systematic uncertainty due to comparison with one specific model. In Baade’s window, the metal-rich stars display a larger vertex deviation (lv = −40°) than the metal-poor stars (lv = 10°), but we do not detect significant lv−metallicity trends in the other fields.

  2. Understanding local stellar kinematic substructures in the solar neighbourhood helps build a complete picture of the formation of the Milky Way, as well as an empirical phase space distribution of dark matter that would inform detection experiments. We apply the clustering algorithm hdbscan on the Gaia early third data release to identify a list of stable clusters in velocity space and action-angle space by taking into account the measurement uncertainties and studying the stability of the clustering results. We find 1405 (497) stars in 23 (6) robust clusters in velocity space (action-angle space) that are consistently not associated with noise. We discuss the kinematic properties of these structures and study whether many of the small clusters belong to a similar larger cluster based on their chemical abundances. They are attributed to the known structures: the Gaia Sausage-Enceladus, the Helmi Stream, and globular cluster NGC 3201 are found in both spaces, while NGC 104 and the thick disc (Sequoia) are identified in velocity space (action-angle space). Although we do not identify any new structures, we find that the hdbscan member selection of already known structures is unstable to input kinematics of the stars when resampled within their uncertainties. We therefore present the stable subset of local kinematic structures, which are consistently identified by the clustering algorithm, and emphasize the need to take into account error propagation during both the manual and automated identification of stellar structures, both for existing ones as well as future discoveries.

  3. Precise Gaia measurements of positions, parallaxes, and proper motions provide an opportunity to calculate 3D positions and 2D velocities (i.e., 5D phase-space) of Milky Way stars. Where available, spectroscopic radial velocity (RV) measurements provide full 6D phase-space information; however, there are now, and will remain, many stars without RV measurements. Without an RV it is not possible to directly calculate 3D stellar velocities; however, one can infer 3D stellar velocities by marginalizing over the missing RV dimension. In this paper, we infer the 3D velocities of stars in the Kepler field in Cartesian Galactocentric coordinates (vx, vy, vz). We directly calculate velocities for around a quarter of all Kepler targets, using RV measurements available from the Gaia, LAMOST, and APOGEE spectroscopic surveys. Using the velocity distributions of these stars as our prior, we infer velocities for the remaining three quarters of the sample by marginalizing over the RV dimension. The median uncertainties on our inferred vx, vy, and vz velocities are around 4, 18, and 4 km/s, respectively. We provide 3D velocities for a total of 148,590 stars in the Kepler field. These 3D velocities could enable kinematic age-dating, Milky Way stellar population studies, and other scientific studies using the benchmark sample of well-studied Kepler stars. Although the methodology used here is broadly applicable to targets across the sky, our prior is specifically constructed from and for the Kepler field. Care should be taken to use a suitable prior when extending this method to other parts of the Galaxy.

  4. We investigate the structure of our Galaxy’s young stellar disc by fitting the distribution functions (DFs) of a new family to 5D Gaia data for a sample of 47 000 OB stars. Tests of the fitting procedure show that the young disc’s DF would be strongly constrained by Gaia data if the distribution of Galactic dust were accurately known. The DF that best fits the real data accurately predicts the kinematics of stars at their observed locations, but it predicts the spatial distribution of stars poorly, almost certainly on account of errors in the best-available dust map. We argue that dust models could be greatly improved by modifying the dust model until the spatial distribution of stars predicted by a DF agreed with the data. The surface density of OB stars is predicted to peak at R ≃ 5.5 kpc, slightly outside the reported peak in the surface density of molecular gas; we suggest that the latter radius may have been underestimated through the use of poor kinematic distances. The velocity distributions predicted by the best-fitting DF for stars with measured line-of-sight velocities v∥ reveal that the outer disc is disturbed at the level of 10 km s⁻¹ in agreement with earlier studies, and that the measured values of v∥ have significant contributions from the orbital velocities of binaries. Hence the outer disc is colder than it is sometimes reported to be.

  5. We present a chemo-dynamical study of the Orphan stellar stream using a catalog of RR Lyrae pulsating variable stars for which photometric, astrometric, and spectroscopic data are available. Employing low-resolution spectra from the Sloan Digital Sky Survey (SDSS), we determined line-of-sight velocities for individual exposures and derived the systemic velocities of the RR Lyrae stars. In combination with the stars’ spectroscopic metallicities and Gaia EDR3 astrometry, we investigated the northern part of the Orphan stream. In our probabilistic approach, we found 20 single mode RR Lyrae variables likely associated with the Orphan stream based on their positions, proper motions, and distances. The acquired sample permitted us to expand our search to nonvariable stars in the SDSS dataset, utilizing line-of-sight velocities determined by the SDSS. We found 54 additional nonvariable stars linked to the Orphan stream. The metallicity distribution for the identified red giant branch stars and blue horizontal branch stars is, on average, −2.13 ± 0.05 dex and −1.87 ± 0.14 dex, with dispersions of 0.23 and 0.43 dex, respectively. The metallicity distribution of the RR Lyrae variables peaks at −1.80 ± 0.06 dex, with a dispersion of 0.25 dex. Using the collected stellar sample, we investigated a possible link between the ultra-faint dwarf galaxy Grus II and the Orphan stream. Based on their kinematics, we found that both the stream RR Lyrae and Grus II are on a prograde orbit with similar orbital properties, although the large uncertainties on the dynamical properties render an unambiguous claim of connection difficult. At the same time, the chemical analysis strongly weakens the connection between the two. We argue that Grus II in combination with the Orphan stream would have to exhibit a strong inverse metallicity gradient, which to date has not been detected in any Local Group system.
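
The marginalization over a missing radial velocity described in item 3 can be sketched in a simplified form. This is an illustration under strong assumptions that are not taken from that work: the prior on each Cartesian velocity component is an independent Gaussian, the tangential velocity vector `v_tan` is known exactly, and the function name `los_velocity_posterior` is hypothetical. Writing the full velocity as `v_tan + s * n_hat`, where `n_hat` is the unit line-of-sight vector, the prior restricted to that line gives a Gaussian posterior for the unknown speed `s`.

```python
import math


def los_velocity_posterior(n_hat, v_tan, prior_mean, prior_sigma):
    """Posterior mean and std of the line-of-sight speed s, with v = v_tan + s * n_hat.

    Assumptions (illustrative only): independent Gaussian prior on each
    Cartesian velocity component, and v_tan measured without error.
    """
    precision = 0.0  # sum over components of n_k^2 / sigma_k^2
    weighted = 0.0   # sum over components of n_k * (m_k - a_k) / sigma_k^2
    for n, a, m, sig in zip(n_hat, v_tan, prior_mean, prior_sigma):
        w = 1.0 / sig**2
        precision += w * n * n
        weighted += w * n * (m - a)
    return weighted / precision, math.sqrt(1.0 / precision)


# Toy example: line of sight along x, tangential motion along y (km/s).
s_mean, s_std = los_velocity_posterior(
    n_hat=(1.0, 0.0, 0.0),
    v_tan=(0.0, 5.0, 0.0),
    prior_mean=(10.0, 0.0, 0.0),
    prior_sigma=(30.0, 30.0, 30.0),
)
```

With the line of sight aligned with one axis, the posterior for `s` reduces to the prior on that component, so the unmeasured direction carries the full prior width. This is in line with the much larger uncertainty quoted in item 3 for one velocity component than for the other two.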