Anomaly Detection in Scientific Datasets using Sparse Representation

Moon, Aekyeung; Kim, Minjun; Chen, Jiaxi; Son, Seung Woo

doi:10.1145/3588982.3603610

As high-performance computing (HPC) systems keep growing in size and complexity, scientists' ability to trust the data they produce is paramount, because data corruption can arise for various reasons and may go undetected. While machine learning-based anomaly detection could relieve scientists of this concern, it is practically infeasible because it requires labels for large volumes of scientific data and incurs unwanted extra overhead. In this paper, we exploit the spatial sparsity profiles exhibited in scientific datasets and propose an approach to detect anomalies effectively. Our method first extracts block-level sparse representations of the original datasets in the transformed domain. It then learns from the extracted sparse representations and builds a boundary threshold between normal and abnormal data without relying on labels. Experiments using real-world scientific datasets show that the proposed approach requires, on average, 13% of the entire dataset (less than 10% in most cases and as low as 0.3%) to achieve detection accuracy (70.74%-100.0%) competitive with two state-of-the-art unsupervised techniques.
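The two steps named in the abstract (block-level sparse representation in a transformed domain, then a label-free threshold) can be illustrated with a minimal sketch. The abstract does not state which transform, sparsity measure, or threshold rule the paper uses, so everything below is an assumption made for illustration: a 2-D DCT per block, sparsity measured as the number of coefficients needed to hold 99% of a block's energy, and a median-plus-MAD cutoff. The function names `dct_matrix`, `block_sparsity`, and `flag_blocks` are hypothetical, and the input is synthetic data, not one of the paper's datasets.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (assumed transform)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def block_sparsity(data, block=8, keep=0.99):
    """Per block, count the DCT coefficients needed to hold `keep` of its energy."""
    d = dct_matrix(block)
    rows, cols = data.shape[0] // block, data.shape[1] // block
    counts = np.zeros((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            tile = data[r * block:(r + 1) * block, c * block:(c + 1) * block]
            coeffs = d @ tile @ d.T                      # 2-D DCT of the tile
            energy = np.sort((coeffs ** 2).ravel())[::-1]
            total = energy.sum()
            if total == 0.0:
                continue                                 # all-zero tile: count stays 0
            cum = np.cumsum(energy) / total
            needed = int(np.searchsorted(cum, keep)) + 1
            counts[r, c] = min(needed, block * block)
    return counts

def flag_blocks(counts, k=5.0):
    """Label-free cutoff (assumed rule): median + k * MAD of the block counts."""
    med = np.median(counts)
    mad = np.median(np.abs(counts - med))
    threshold = med + k * max(mad, 1.0)                  # floor the MAD to avoid a zero-width band
    return counts > threshold, threshold

# Synthetic smooth field with one injected corruption (illustrative only).
y, x = np.mgrid[0:64, 0:64]
field = 10.0 + np.sin(2 * np.pi * x / 64) + np.cos(2 * np.pi * y / 64)
field[19, 28] += 1e3                                     # lands in block (2, 3)

counts = block_sparsity(field)
mask, threshold = flag_blocks(counts)
print(np.argwhere(mask), threshold)
```

The design intuition is that smooth scientific fields compress into a few low-frequency coefficients per block, whereas an isolated corrupted value spreads its energy across many coefficients, so the block stops being sparse. The sketch scans every block; the sampling strategy behind the paper's reported dataset fractions is not described in the abstract and is not reproduced here.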

Award ID(s): 2312982; 1751143

PAR ID: 10441651

Author(s) / Creator(s): Moon, Aekyeung; Kim, Minjun; Chen, Jiaxi; Son, Seung Woo

Date Published: 2023-08-10

Journal Name: Proceedings of the First Workshop on AI for Systems

Page Range / eLocation ID: 13 to 18

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1145/3588982.3603610
