Title: Hierarchical Cross-entropy Loss for Classification of Astrophysical Transients
Astrophysical transient phenomena are traditionally classified spectroscopically in a hierarchical taxonomy; however, this graph structure is currently not utilized in neural net-based photometric classifiers for time-domain astrophysics. Instead, independent classifiers are trained for different tiers of classified data, and events are excluded if they fall outside of these well-defined but flat classification schemes. Here, we introduce a weighted hierarchical cross-entropy objective function for classification of astrophysical transients. Our method allows users to directly build and use physics- or observationally-motivated tree-based taxonomies. Our weighted hierarchical cross-entropy loss directly uses this graph to accurately classify all targets into any node of the tree, re-weighting imbalanced classes. We test our novel loss on a set of variable stars and extragalactic transients from the Zwicky Transient Facility, showing that we can achieve similar performance to fine-tuned classifiers with the advantage of notably more flexibility in downstream classification tasks.
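The abstract does not spell out the form of the loss, so the following is only a minimal sketch of one way a weighted hierarchical cross-entropy over a tree taxonomy could be written. It is not the authors' implementation. The toy taxonomy, the function names, the softmax-over-siblings parameterization, and the depth weighting `exp(-alpha * depth)` are all assumptions made for illustration; only the ideas of a tree taxonomy, targets at any node, and class re-weighting come from the abstract.

```python
import numpy as np

# Hypothetical toy taxonomy (parent -> children); NOT the paper's tree.
TREE = {
    "root": ["variable_star", "transient"],
    "variable_star": ["periodic", "eruptive"],
    "transient": ["SN_Ia", "SN_II", "TDE"],
}
NODES = [c for kids in TREE.values() for c in kids]  # every non-root node
INDEX = {name: i for i, name in enumerate(NODES)}
PARENT = {c: p for p, kids in TREE.items() for c in kids}


def path_from_root(node):
    """Return the nodes from the first level below the root down to `node`."""
    path = []
    while node != "root":
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def conditional_log_probs(logits):
    """Log-softmax within each sibling group, i.e. log p(child | parent)."""
    logp = np.empty_like(logits, dtype=float)
    for kids in TREE.values():
        idx = [INDEX[k] for k in kids]
        z = logits[idx] - np.max(logits[idx])  # shift for numerical stability
        logp[idx] = z - np.log(np.sum(np.exp(z)))
    return logp


def hierarchical_cross_entropy(logits, target, class_weights=None, alpha=0.5):
    """Loss for one example whose label `target` may be ANY node of the tree.

    Sums -log p(node | parent) along the root-to-target path. Each term is
    scaled by exp(-alpha * depth) (an assumed weighting), and the total is
    scaled by an optional per-class weight to offset class imbalance.
    """
    logp = conditional_log_probs(np.asarray(logits, dtype=float))
    loss = 0.0
    for depth, node in enumerate(path_from_root(target)):
        loss -= np.exp(-alpha * depth) * logp[INDEX[node]]
    weight = 1.0 if class_weights is None else class_weights.get(target, 1.0)
    return weight * loss
```

In this sketch, an event labeled only as `"transient"` contributes a single term to the loss, while one labeled `"SN_Ia"` contributes two, so coarsely labeled events can be used in training rather than excluded. With `alpha=0` and no class weights, the loss for a leaf reduces to the negative log of the product of conditional probabilities along its path.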
Award ID(s):
2433718
PAR ID:
10538376
Publisher / Repository:
NeurIPS
Sponsoring Org:
National Science Foundation
More Like this
  1. With the advent of the Vera C. Rubin Observatory, the discovery rate of supernovae (SNe) will surpass the rate of SNe with real time spectroscopic follow-up by 3 orders of magnitude. Accurate photometric classifiers are essential to both select interesting events for follow-up in real time and for archival population-level studies. In this work, we investigate the impact of observable host-galaxy information on the classification of SNe, both with and without additional light-curve and redshift information. We find that host-galaxy information alone can successfully isolate relatively pure (>90%) samples of Type Ia SNe with or without redshift information. With redshift information, we can additionally produce somewhat pure (>70%) samples of Type II SNe and superluminous SNe. Additionally with redshift information, host-galaxy properties do not significantly improve the accuracy of SN classification when paired with complete light curves. In the absence of redshift information, however, galaxy properties significantly increase the accuracy of photometric classification. As a part of this analysis, we present the first formal application of a new objective function, the weighted hierarchical cross entropy, to the problem of SN classification. This objective function more naturally accounts for the hierarchical nature of SN classes and, more broadly, transients. Finally, we present a new set of SN classifications for the Pan-STARRS Medium Deep Survey of SNe that lack spectroscopic redshift, increasing the full photometric sample to >4400 events.
  2. Zelinski, Michael E.; Taha, Tarek M.; Howe, Jonathan (Ed.)
    Image classification forms an important class of problems in machine learning and is widely used in many real-world applications, such as medicine, ecology, astronomy, and defense. Convolutional neural networks (CNNs) are machine learning techniques designed for inputs with grid structures, e.g., images, whose features are spatially correlated. As such, CNNs have been demonstrated to be highly effective approaches for many image classification problems and have consistently outperformed other approaches in many image classification and object detection competitions. A particular challenge involved in using machine learning for classifying images is measurement data loss in the form of missing pixels, which occurs in settings where scene occlusions are present or where the photodetectors in the imaging system are partially damaged. In such cases, the performance of CNN models tends to deteriorate or become unreliable even when the perturbations to the input image are small. In this work, we investigate techniques for improving the performance of CNN models for image classification with missing data. In particular, we explore training on a variety of data alterations that mimic data loss for producing more robust classifiers. By optimizing the categorical cross-entropy loss function, we demonstrate through numerical experiments on the MNIST dataset that training with these synthetic alterations can enhance the classification accuracy of our CNN models.
  3. Over the past decade wide-field optical time-domain surveys have increased the discovery rate of transients to the point that ≲10% are being spectroscopically classified. Despite this, these surveys have enabled the discovery of new and rare types of transients, most notably the class of hydrogen-poor superluminous supernovae (SLSN-I), with about 150 events confirmed to date. Here we present a machine-learning classification algorithm targeted at rapid identification of a pure sample of SLSN-I to enable spectroscopic and multiwavelength follow-up. This algorithm is part of the Finding Luminous and Exotic Extragalactic Transients (FLEET) observational strategy. It utilizes both light-curve and contextual information, but without the need for a redshift, to assign each newly discovered transient a probability of being a SLSN-I. This classifier can achieve a maximum purity of about 85% (with 20% completeness) when observing a selection of SLSN-I candidates. Additionally, we present two alternative classifiers that use either redshifts or complete light curves and can achieve an even higher purity and completeness. At the current discovery rate, the FLEET algorithm can provide about 20 SLSN-I candidates per year for spectroscopic follow-up with 85% purity; with the Legacy Survey of Space and Time we anticipate this will rise to more than $\sim 10^{3}$ events per year.
  4. Noise of non-astrophysical origin contaminates science data taken by the Advanced Laser Interferometer Gravitational-wave Observatory and Advanced Virgo gravitational-wave detectors. Characterization of instrumental and environmental noise transients has proven critical in identifying false positives in the first aLIGO observing run O1. In this talk, we present three algorithms designed for the automatic classification of non-astrophysical transients in advanced detectors. Principal Component Analysis for Transients (PCAT) and an adaptation of LALInference Burst (PC-LIB) are based on Principal Component Analysis. The third algorithm is a combination of a glitch finder called Wavelet Detection Filter (WDF) and unsupervised machine learning techniques for classification. 
  5. Deep neural networks have achieved significant success in the last decades, but they are not well-calibrated and often produce unreliable predictions. A large body of literature relies on uncertainty quantification to evaluate the reliability of a learning model, which is particularly important for applications of out-of-distribution (OOD) detection and misclassification detection. We are interested in uncertainty quantification for interdependent node-level classification. We start our analysis based on graph posterior networks (GPNs) that optimize the uncertainty cross-entropy (UCE)-based loss function. We describe the theoretical limitations of the widely-used UCE loss. To alleviate the identified drawbacks, we propose a distance-based regularization that encourages clustered OOD nodes to remain clustered in the latent space. We conduct extensive comparison experiments on eight standard datasets and demonstrate that the proposed regularization outperforms the state-of-the-art in both OOD detection and misclassification detection.