
Title: Elastic Depths for Detecting Shape Anomalies in Functional Data
We propose a new family of depth measures, called elastic depths, that can greatly improve shape anomaly detection in functional data. Shape anomalies are functions whose geometric forms or features differ considerably from those of the rest of the data. Identifying them is generally more difficult than identifying magnitude anomalies because shape anomalies are often not distinguishable from the bulk of the data with visualization methods. The proposed elastic depths use the recently developed elastic distances to directly measure the centrality of functions in the amplitude and phase spaces. Measuring shape outlyingness in these spaces provides a rigorous quantification of shape, which gives the elastic depths a strong theoretical and practical advantage over other methods in detecting shape anomalies. A simple boxplot and thresholding method is introduced to identify shape anomalies using the elastic depths. We assess the elastic depths' detection skill on simulated shape outlier scenarios and compare it against that of popular shape anomaly detectors. Finally, we use hurricane trajectories to demonstrate the elastic depth methodology on manifold-valued functional data.
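The abstract does not give the depth formula or the thresholding rule, so the following Python sketch only illustrates the general idea of scoring centrality from pairwise distances and flagging low-depth functions with a boxplot-style fence. Everything in it is an assumption made for illustration: the depth form 1 / (1 + median distance), the plain root-mean-square distance used as a stand-in for the elastic amplitude and phase distances, the lower-fence multiplier `k=1.5`, and the function names `distance_depths` and `flag_low_depth`.

```python
import numpy as np


def distance_depths(dist):
    """Depth of each function from a symmetric pairwise distance matrix.

    Assumed form (not taken from the abstract): 1 / (1 + median distance
    to the other functions), so central functions score close to 1.
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Drop the zero self-distances on the diagonal before taking medians.
    off_diag = dist[~np.eye(n, dtype=bool)].reshape(n, n - 1)
    return 1.0 / (1.0 + np.median(off_diag, axis=1))


def flag_low_depth(depths, k=1.5):
    """Flag depths below a boxplot-style lower fence, Q1 - k * IQR (assumed rule)."""
    q1, q3 = np.percentile(depths, [25, 75])
    return depths < q1 - k * (q3 - q1)


# Toy data: 20 sine curves that differ only slightly in magnitude,
# plus one curve with a different shape (three cycles instead of one).
t = np.linspace(0.0, 1.0, 101)
typical = np.array([(1.0 + 0.01 * i) * np.sin(2 * np.pi * t) for i in range(20)])
odd_shape = np.sin(6 * np.pi * t)
curves = np.vstack([typical, odd_shape])

# Stand-in distance: root-mean-square difference on the grid. An elastic
# amplitude or phase distance would replace this step in practice.
diffs = curves[:, None, :] - curves[None, :, :]
dist = np.sqrt(np.mean(diffs ** 2, axis=-1))

depths = distance_depths(dist)
flags = flag_low_depth(depths)
print("flagged curve indices:", np.flatnonzero(flags))
```

Because the functions only need a distance matrix, swapping in amplitude and phase distances from an elastic registration routine would yield two separate depth vectors, one per space, each of which could be thresholded in the same way.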
Award ID(s):
1922758; 1830312
NSF-PAR ID:
10291130
Journal Name:
Technometrics
Page Range or eLocation-ID:
1 to 11
ISSN:
0040-1706
Sponsoring Org:
National Science Foundation
More Like this
  1. Density estimation is a widely used method to perform unsupervised anomaly detection. By learning the density function, data points with relatively low densities are classified as anomalies. Unfortunately, the presence of anomalies in training data may significantly impact the density estimation process, thereby imposing significant challenges to the use of more sophisticated density estimation methods such as those based on deep neural networks. In this work, we propose RobustRealNVP, a deep density estimation framework that enhances the robustness of flow-based density estimation methods, enabling their application to unsupervised anomaly detection. RobustRealNVP differs from existing flow-based models in two respects. First, RobustRealNVP discards data points with low estimated densities during optimization to prevent them from corrupting the density estimation process. Furthermore, it imposes Lipschitz regularization on the flow-based model to enforce smoothness in the estimated density function. We demonstrate the robustness of our algorithm against anomalies in training data from both theoretical and empirical perspectives. The results show that our algorithm achieves competitive results compared to state-of-the-art unsupervised anomaly detection methods.
  2. Elastic Riemannian metrics have been used successfully for statistical treatments of functional and curve shape data. However, this usage suffers from a significant restriction: the function boundaries are assumed to be fixed and matched. Functional data often comes with unmatched boundaries, e.g., in dynamical systems with variable evolution rates, such as COVID-19 infection rate curves associated with different geographical regions. Here, we develop a Riemannian framework that allows for partial matching, comparing, and clustering functions under phase variability and uncertain boundaries. We extend past work by (1) Defining a new diffeomorphism group G over the positive reals that is the semidirect product of a time-warping group and a time-scaling group; (2) Introducing a metric that is invariant to the action of G; (3) Imposing a Riemannian Lie group structure on G to allow for an efficient gradient-based optimization for elastic partial matching; and (4) Presenting a modification that, while losing the metric property, allows one to control the amount of boundary disparity in the registration. We illustrate this framework by registering and clustering shapes of COVID-19 rate curves, identifying basic patterns, minimizing mismatch errors, and reducing variability within clusters compared to previous methods.
  3. Anomaly detection aims at identifying data points that show systematic deviations from the majority of data in an unlabeled dataset. A common assumption is that clean training data (free of anomalies) is available, which is often violated in practice. We propose a strategy for training an anomaly detector in the presence of unlabeled anomalies that is compatible with a broad class of models. The idea is to jointly infer binary labels for each datum (normal vs. anomalous) while updating the model parameters. Inspired by outlier exposure (Hendrycks et al., 2018) that considers synthetically created, labeled anomalies, we thereby use a combination of two losses that share parameters: one for the normal and one for the anomalous data. We then iteratively proceed with block coordinate updates on the parameters and the most likely (latent) labels. Our experiments with several backbone models on three image datasets, 30 tabular data sets, and a video anomaly detection benchmark showed consistent and significant improvements over the baselines.
  4. Network anomaly detection aims to find network elements (e.g., nodes, edges, subgraphs) with significantly different behaviors from the vast majority. It has a profound impact in a variety of applications ranging from finance and healthcare to social network analysis. Due to the prohibitive labeling cost, existing methods are predominantly developed in an unsupervised manner. Nonetheless, the anomalies they identify may turn out to be data noise or uninteresting data instances due to the lack of prior knowledge on the anomalies of interest. Hence, it is critical to investigate and develop few-shot learning for network anomaly detection. In real-world scenarios, a few labeled anomalies are also easy to access on similar networks from the same domain as the target network, while most of the existing works fail to leverage them and merely focus on a single network. Taking advantage of this potential, in this work, we tackle the problem of few-shot network anomaly detection by (1) proposing a new family of graph neural networks -- Graph Deviation Networks (GDN) that can leverage a small number of labeled anomalies for enforcing statistically significant deviations between abnormal and normal nodes on a network; (2) equipping the proposed GDN with a new cross-network meta-learning algorithm to realize few-shot network anomaly detection by transferring meta-knowledge from multiple auxiliary networks. Extensive experimental evaluations demonstrate the efficacy of the proposed approach on few-shot or even one-shot network anomaly detection.
  5. SUMMARY Interfaces are an important part of Earth’s layering structure. Here, we developed a new model parametrization and iterative linearized inversion method that determines 1-D crustal velocity structure using surface wave dispersion, teleseismic P-wave receiver functions and Ps and PmP traveltimes. Unlike previous joint inversion methods, the new model parametrization includes interface depths and layer Vp/Vs ratios so that a smoothness constraint can be conveniently applied to velocities of individual layers without affecting the velocity discontinuity across the interfaces. It also allows adding interface-related observations such as traveltimes of Ps and PmP in the joint inversion to eliminate the trade-off between interface depth and Vp/Vs ratio and therefore to reduce the uncertainties of results. Numerical tests show that the method is computationally efficient and the inversion results are robust and independent of the initial model. Application of the method to a dense linear array across the Wabash Valley Seismic Zone (WVSZ) produced a high-resolution crustal image in this seismically active region. The results show a 51–55-km-thick crust with a mid-crustal interface at 14–17 km. The crustal Vp/Vs ratio varies from 1.69 to 1.90. There are three pillow-like, ∼100 km apart high-velocity bodies sitting at the base of the crust and directly above each of them are a low-velocity anomaly in the middle crust and a high-velocity anomaly in the upper crust. They are interpreted to be produced by mantle magmatic intrusions and remelting during rifting events at the end of the Precambrian. The current diffuse seismicity in the WVSZ might be rooted in this ancient distributed rifting structure.