Title: The ZTF Source Classification Project. III. A Catalog of Variable Sources
Abstract The classification of variable objects provides insight into a wide variety of astrophysics ranging from stellar interiors to galactic nuclei. The Zwicky Transient Facility (ZTF) provides time-series observations that record the variability of more than a billion sources. The scale of these data necessitates automated approaches for a thorough analysis. Building on previous work, this paper reports the results of the ZTF Source Classification Project (SCoPe), which trains neural network and XGBoost (XGB) machine-learning (ML) algorithms to perform dichotomous classification of variable ZTF sources using a manually constructed training set containing 170,632 light curves. We find that several classifiers achieve high precision and recall scores, suggesting the reliability of their predictions for 209,991,147 light curves across 77 ZTF fields. We also identify the most important features for XGB classification and compare the performance of the two ML algorithms, finding a pattern of higher precision among XGB classifiers. The resulting classification catalog is available to the public, and the software developed for SCoPe is open source and adaptable to future time-domain surveys.
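The precision and recall scores mentioned in the abstract can be illustrated with a minimal sketch for a single dichotomous (one-class-versus-rest) classifier. The labels, scores, 0.5 threshold, and the function name `precision_recall` are illustrative assumptions, not SCoPe data or SCoPe code.

```python
from typing import Sequence, Tuple


def precision_recall(
    y_true: Sequence[int],
    scores: Sequence[float],
    threshold: float = 0.5,  # assumed cutoff, chosen only for illustration
) -> Tuple[float, float]:
    """Return (precision, recall) for one binary classifier."""
    # Turn each probability-like score into a yes/no prediction.
    y_pred = [1 if s >= threshold else 0 for s in scores]

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Precision: fraction of positive predictions that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of true positives that are recovered.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up example: 1 = source belongs to the class, 0 = it does not.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]
precision, recall = precision_recall(labels, scores)
print(precision, recall)
```

Raising the threshold generally trades recall for precision, which is one reason the two scores are reported together.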
Award ID(s):
2049645 2034437 2117997
PAR ID:
10503556
Publisher / Repository:
DOI PREFIX: 10.3847
Date Published:
Journal Name:
The Astrophysical Journal Supplement Series
Volume:
272
Issue:
1
ISSN:
0067-0049
Format(s):
Medium: X
Size(s):
Article No. 14
Sponsoring Org:
National Science Foundation
More Like this
  1. Aims. We present a variability-, color-, and morphology-based classifier designed to identify multiple classes of transients and persistently variable and non-variable sources from the Zwicky Transient Facility (ZTF) Data Release 11 (DR11) light curves of extended and point sources. The main motivation for developing this model was to identify active galactic nuclei (AGN) at different redshift ranges to be observed by the 4MOST Chilean AGN/Galaxy Evolution Survey (ChANGES). It also serves as a more general time-domain astronomy study. Methods. The model uses nine colors computed from CatWISE and Pan-STARRS1 (PS1), a morphology score from PS1, and 61 single-band variability features computed from the ZTF DR11 g and r light curves. We trained two versions of the model, one for each ZTF band, since ZTF DR11 treats the light curves observed in a particular combination of field, filter, and charge-coupled device (CCD) quadrant independently. We used a hierarchical local classifier per parent node approach, where each node is composed of a balanced random forest model. We adopted a taxonomy with 17 classes: non-variable stars, non-variable galaxies, three transients (SNIa, SN-other, and CV/Nova), five classes of stochastic variables (lowz-AGN, midz-AGN, highz-AGN, Blazar, and YSO), and seven classes of periodic variables (LPV, EA, EB/EW, DSCT, RRL, CEP, and Periodic-other). Results. The macro-averaged precision, recall, and F1-score are 0.61, 0.75, and 0.62 for the g-band model, and 0.60, 0.74, and 0.61 for the r-band model. When grouping the four AGN classes (lowz-AGN, midz-AGN, highz-AGN, and Blazar) into one single class, its precision, recall, and F1-score are 1.00, 0.95, and 0.97, respectively, for both the g and r bands. This demonstrates the good performance of the model in classifying AGN candidates.
We applied the model to all the sources in the ZTF/4MOST overlapping sky (−28 ≤ Dec ≤ 8.5), avoiding ZTF fields that cover the Galactic bulge (|gal_b| ≤ 9 and gal_l ≤ 50). This area includes 86 576 577 light curves in the g band and 140 409 824 in the r band with 20 or more observations and with an average magnitude in the corresponding band lower than 20.5. Only 0.73% of the g-band light curves and 2.62% of the r-band light curves were classified as stochastic, periodic, or transient with high probability (P_init ≥ 0.9). Even though the metrics obtained for the two models are similar, we find that, in general, more reliable results are obtained when using the g-band model. With it, we identified 384 242 AGN candidates (including low-, mid-, and high-redshift AGN and Blazars), 287 156 of which have P_init ≥ 0.9.
  2. Abstract The Bright Transient Survey (BTS) aims to obtain a classification spectrum for all bright (m_peak ≤ 18.5 mag) extragalactic transients found in the Zwicky Transient Facility (ZTF) public survey. BTS critically relies on visual inspection (“scanning”) to select targets for spectroscopic follow-up, which, while effective, has required a significant time investment over the past ∼5 yr of ZTF operations. We present BTSbot, a multimodal convolutional neural network, which provides a bright transient score to individual ZTF detections using their image data and 25 extracted features. BTSbot is able to eliminate the need for daily human scanning by automatically identifying and requesting spectroscopic follow-up observations of new bright transient candidates. BTSbot recovers all bright transients in our test split and performs on par with scanners in terms of identification speed (on average, ∼1 hr quicker than scanners). We also find that BTSbot is not significantly impacted by any data shift by comparing performance across a concealed test split and a sample of very recent BTS candidates. BTSbot has been integrated into Fritz and Kowalski, ZTF’s first-party marshal and alert broker, and now sends automatic spectroscopic follow-up requests for the new transients it identifies. Between 2023 December and 2024 May, BTSbot selected 609 sources in real time, 96% of which were real extragalactic transients. With BTSbot and other automation tools, the BTS workflow has produced the first fully automatic end-to-end discovery and classification of a transient, representing a significant reduction in the human time needed to scan.
  3. Abstract Photometric classifications of supernova (SN) light curves have become necessary to utilize the full potential of large samples of observations obtained from wide-field photometric surveys, such as the Zwicky Transient Facility (ZTF) and the Vera C. Rubin Observatory. Here, we present a photometric classifier for SN light curves that does not rely on redshift information and still maintains comparable accuracy to redshift-dependent classifiers. Our new package, Superphot+, uses a parametric model to extract meaningful features from multiband SN light curves. We train a gradient-boosted machine with fit parameters from 6061 ZTF SNe that pass data quality cuts and are spectroscopically classified as one of five classes: SN Ia, SN II, SN Ib/c, SN IIn, and SLSN-I. Without redshift information, our classifier yields a class-averaged F1-score of 0.61 ± 0.02 and a total accuracy of 0.83 ± 0.01. Including redshift information improves these metrics to 0.71 ± 0.02 and 0.88 ± 0.01, respectively. We assign new class probabilities to 3558 ZTF transients that show SN-like characteristics (based on the ALeRCE Broker light-curve and stamp classifiers) but lack spectroscopic classifications. Finally, we compare our predicted SN labels with those generated by the ALeRCE light-curve classifier, finding that the two classifiers agree on photometric labels for 82% ± 2% of light curves with spectroscopic labels and 72% ± 0% of light curves without spectroscopic labels. Superphot+ is currently classifying ZTF SNe in real time via the ANTARES Broker, and is designed for simple adaptation to six-band Rubin light curves in the future.
  4. Abstract Optical surveys have become increasingly adept at identifying candidate tidal disruption events (TDEs) in large numbers, but classifying these generally requires extensive spectroscopic resources. Here we present tdescore, a simple binary photometric classifier that is trained using a systematic census of ∼3000 nuclear transients from the Zwicky Transient Facility (ZTF). The sample is highly imbalanced, with TDEs representing ∼2% of the total. tdescore is nonetheless able to reject non-TDEs with 99.6% accuracy, yielding a sample of probable TDEs with recall of 77.5% for a precision of 80.2%. tdescore is thus substantially better than any available TDE photometric classifier scheme in the literature, with performance not far from spectroscopy as a method for classifying ZTF nuclear transients, despite relying solely on ZTF data and multiwavelength catalog cross matching. In a novel extension, we use “Shapley additive explanations” to provide a human-readable justification for each individual tdescore classification, enabling users to understand and form opinions about the underlying classifier reasoning. tdescore can serve as a model for photometric identification of TDEs with time-domain surveys, such as the upcoming Rubin observatory.
  5. Abstract The rotation period of a star is an important quantity that provides insight into its structure and state. For stars with surface features like starspots, their periods can be inferred from brightness variations as these features move across the stellar surface. TESS, with its all-sky coverage, is providing the largest sample of stars for obtaining rotation periods. However, most of the periods have been limited to shorter than the 13.7 day TESS orbital period due to strong background signals (e.g., scattered light) on those timescales. In this study, we investigated the viability of measuring longer periods (>10 days) from TESS light curves for stars in the Northern Continuous Viewing Zone (NCVZ). We first created a reference set of 272 period measurements longer than 10 days for K and M dwarfs in the NCVZ using data from the Zwicky Transient Facility (ZTF) that we consider as the “ground truth” given ZTF’s long temporal baseline of 6+ years. We then used the unpopular pipeline to detrend TESS light curves and implemented a modified Lomb–Scargle (LS) periodogram that accounts for flux offsets between observing sectors. For 179 out of the 272 sources (66%), the TESS-derived periods match the ZTF-derived periods to within 10%. The match rate increases to 81% (137 out of 170) when restricting to sources with a TESS LS power that exceeds a threshold. Our results confirm the capability of measuring periods longer than 10 days from TESS data, highlighting the data set’s potential for studying slow rotators.
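As a rough illustration of the match-rate figures quoted in item 5, the sketch below counts how many test periods fall within 10% of a reference period and re-derives the two quoted percentages from their counts. Measuring the tolerance relative to the reference (ZTF) period is an assumption, and the function name `fraction_matching` and the sample periods are hypothetical, not taken from that study.

```python
from typing import Sequence


def fraction_matching(
    reference: Sequence[float],
    measured: Sequence[float],
    tol: float = 0.10,  # 10% tolerance, assumed relative to the reference period
) -> float:
    """Fraction of measured periods within `tol` of the reference period."""
    matches = sum(
        1 for ref, obs in zip(reference, measured) if abs(obs - ref) <= tol * ref
    )
    return matches / len(reference)


# Hypothetical periods in days, for illustration only.
ztf_periods = [12.0, 20.0, 35.0, 50.0]
tess_periods = [12.5, 26.0, 34.0, 44.0]
print(fraction_matching(ztf_periods, tess_periods))

# Consistency check of the percentages quoted in the abstract.
print(round(100 * 179 / 272), round(100 * 137 / 170))
```

The two rounded values reproduce the 66% and 81% match rates stated in the abstract.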