Factor analysis of mixed data for anomaly detection

Davidow, Matthew; Matteson, David S

doi:10.1002/sam.11585

Abstract: Anomaly detection aims to identify observations that deviate from the typical pattern of data. Anomalous observations may correspond to financial fraud, health risks, or incorrectly measured data in practice. We focus on unsupervised detection and the continuous and categorical (mixed) variable case. We show that detecting anomalies in mixed data is enhanced through first embedding the data then assessing an anomaly scoring scheme. We propose a kurtosis‐weighted Factor Analysis of Mixed Data for anomaly detection to obtain a continuous embedding for anomaly scoring. We illustrate that anomalies are highly separable in the first and last few ordered dimensions of this space, and test various anomaly scoring experiments within this subspace. Results are illustrated for both simulated and real datasets, and the proposed approach is highly accurate for mixed data throughout these diverse scenarios.
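The embed-then-score idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `famd_embed` and `kurtosis_weighted_scores`, the synthetic data, and the specific weighting rule (each component weighted by its normalized Pearson kurtosis, with all components used rather than only the first and last few) are assumptions made for illustration. The embedding follows the standard Factor Analysis of Mixed Data recipe: standardize continuous columns, scale one-hot indicators by the square root of their level proportion, and apply PCA.

```python
import numpy as np

def famd_embed(X_num, X_cat):
    """FAMD-style embedding: scale mixed columns, then PCA via SVD."""
    # Standardize continuous columns to zero mean and unit variance.
    Z_num = (X_num - X_num.mean(axis=0)) / X_num.std(axis=0)
    blocks = [Z_num]
    for j in range(X_cat.shape[1]):
        levels = np.unique(X_cat[:, j])
        # One-hot indicators, one column per observed level.
        ind = (X_cat[:, [j]] == levels[None, :]).astype(float)
        p = ind.mean(axis=0)  # observed proportion of each level
        # Center and divide by sqrt(p), the usual FAMD scaling.
        blocks.append((ind - p) / np.sqrt(p))
    Z = np.hstack(blocks)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    keep = s > 1e-8 * s[0]  # drop numerically null directions
    return U[:, keep] * s[keep]  # principal component scores

def kurtosis_weighted_scores(F):
    """Assumed scoring rule: kurtosis-weighted sum of squared scores."""
    Fs = (F - F.mean(axis=0)) / F.std(axis=0)
    kurt = (Fs ** 4).mean(axis=0)  # Pearson kurtosis (3 for a Gaussian)
    w = kurt / kurt.sum()  # heavier-tailed components get more weight
    return (Fs ** 2) @ w

# Synthetic mixed data (hypothetical): 3 continuous and 2 categorical columns.
rng = np.random.default_rng(0)
X_num = rng.normal(size=(200, 3))
X_cat = rng.choice(["a", "b", "c"], size=(200, 2))
X_num[0] = [8.0, -8.0, 8.0]  # plant one anomaly in row 0

F = famd_embed(X_num, X_cat)
scores = kurtosis_weighted_scores(F)
top = int(np.argmax(scores))  # index of the most anomalous row
```

A component on which a few points sit far from the bulk has high kurtosis, so weighting by kurtosis emphasizes exactly the directions where anomalies stand out. Restricting the score to a chosen subset of leading and trailing components, as the abstract suggests, would amount to slicing the columns of `F` before scoring.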

Award ID(s): 1455172; 1934985; 1940124; 1940276

PAR ID: 10580839

Author(s) / Creator(s): Davidow, Matthew; Matteson, David S

Publisher / Repository: Wiley

Date Published: 2022-08-01

Journal Name: Statistical Analysis and Data Mining: The ASA Data Science Journal

Volume: 15

Issue: 4

ISSN: 1932-1864

Page Range / eLocation ID: 480 to 493

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1002/sam.11585