Deep Set Auto Encoders for Anomaly Detection in Particle Physics
There is increasing interest in model-agnostic search strategies for physics beyond the Standard Model at the Large Hadron Collider. We introduce a Deep Set Variational Autoencoder and present results on the Dark Machines Anomaly Score Challenge. We find that the method attains its best anomaly detection performance when the network has no decoding step and the anomaly score is based solely on the representation within the encoded latent space. This method was one of the top-performing models in the Dark Machines Challenge, on both the open and the blinded data sets.
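The idea of scoring an event from its latent representation alone can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the random (untrained) weights, the function names, and the choice of the KL divergence to a standard normal prior as the latent-space score are all assumptions made for illustration. The sketch shows only the two structural ingredients named in the abstract, a permutation-invariant Deep Sets encoder and a score that needs no decoder.

```python
import numpy as np

def init_params(rng, n_features=4, hidden=16, latent=3):
    """Random, untrained weights. All sizes are illustrative assumptions."""
    return {
        "W_phi": rng.normal(0.0, 0.3, (n_features, hidden)),
        "b_phi": np.zeros(hidden),
        "W_mu": rng.normal(0.0, 0.3, (hidden, latent)),
        "b_mu": np.zeros(latent),
        "W_logvar": rng.normal(0.0, 0.3, (hidden, latent)),
        "b_logvar": np.zeros(latent),
    }

def encode(event, params):
    """Deep Sets encoder: per-object map, sum pooling, then two linear heads."""
    # Apply the same ReLU layer to every object (row) in the event.
    h = np.maximum(event @ params["W_phi"] + params["b_phi"], 0.0)
    # Summing over objects makes the result independent of object ordering.
    pooled = h.sum(axis=0)
    mu = pooled @ params["W_mu"] + params["b_mu"]
    logvar = pooled @ params["W_logvar"] + params["b_logvar"]
    return mu, logvar

def latent_anomaly_score(event, params):
    """Assumed score: KL(N(mu, diag(exp(logvar))) || N(0, I)). No decoder is used."""
    mu, logvar = encode(event, params)
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

rng = np.random.default_rng(0)
params = init_params(rng)
event = rng.normal(size=(5, 4))       # hypothetical event: 5 objects, 4 features each
shuffled = event[rng.permutation(5)]  # same objects in a different order

score = latent_anomaly_score(event, params)
score_shuffled = latent_anomaly_score(shuffled, params)
```

In this sketch `score` and `score_shuffled` agree up to floating-point rounding, because the sum pooling discards the ordering of objects within an event. In practice the weights would be trained on background-dominated data before the score is used to rank events.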
NSF-PAR ID: 10323040
Journal Name: SciPost Physics
Volume: 12
Issue: 1
ISSN: 2542-4653
We present the Swimmy (Subaru WIde-field Machine-learning anoMalY) survey program, a deep-learning-based search for unique sources using multicolor (grizy) imaging data from the Hyper Suprime-Cam Subaru Strategic Program (HSC-SSP). The program aims to detect unexpected, novel, and rare populations and phenomena by exploiting the deep, wide-field imaging data of the HSC-SSP. This article, the first paper in the Swimmy series, describes an anomaly detection technique that selects unique populations as "outliers" from the data set. The model was tested with known extreme emission-line galaxies (XELGs) and quasars, confirming that the proposed method selects $\sim\!\! 60\%$–$70\%$ of the quasars and $60\%$ of the XELGs without labeled training data. Using spectral information for local galaxies at z = 0.05–0.2 from the Sloan Digital Sky Survey, we investigated the physical properties of the selected anomalies and compared them according to the significance of their outlier values. The results show that XELGs constitute notable fractions of the most anomalous galaxies and that certain galaxies exhibit unique morphological features. In summary, deep anomaly detection is an effective tool for searching large data sets for rare objects and, ultimately, unknown unknowns. Further development of the …