skip to main content

Title: Learning Patterns for Detection with Multiscale Scan Statistics
This paper addresses detecting anomalous patterns in images, time-series, and tensor data when the location and scale of the pattern and the pattern itself is unknown a priori. The multiscale scan statistic convolves the proposed pattern with the image at various scales and returns the maximum of the resulting tensor. Scale corrected multiscale scan statistics apply different standardizations at each scale, and the limiting distribution under the null hypothesis---that the data is only noise---is known for smooth patterns. We consider the problem of simultaneously learning and detecting the anomalous pattern from a dictionary of smooth patterns and a database of many tensors. To this end, we show that the multiscale scan statistic is a subexponential random variable, and prove a chaining lemma for standardized suprema, which may be of independent interest. Then by averaging the statistics over the database of tensors we can learn the pattern and obtain Bernstein-type error bounds. We will also provide a construction of an epsilon-net of the location and scale parameters, providing a computationally tractable approximation with similar error bounds.
Award ID(s):
Publication Date:
Journal Name:
Proceedings of Machine Learning Research (31st Annual Conference on Learning Theory))
Sponsoring Org:
National Science Foundation
More Like this
  1. We propose a new approach, the calibrated nonparametric scan statistic (CNSS), for more accurate detection of anomalous patterns in large-scale, real-world graphs. Scan statistics identify connected subgraphs that are interesting or unexpected through maximization of a likelihood ratio statistic; in particular, nonparametric scan statistics (NPSSs) identify subgraphs with a higher than expected proportion of individually significant nodes. However, we show that recently proposed NPSS methods are miscalibrated, failing to account for the maximization of the statistic over the multiplicity of subgraphs. This results in both reduced detection power for subtle signals, and low precision of the detected subgraph even for stronger signals. Thus we develop a new statistical approach to recalibrate NPSSs, correctly adjusting for multiple hypothesis testing and taking the underlying graph structure into account. While the recalibration, based on randomization testing, is computationally expensive, we propose both an efficient (approximate) algorithm and new, closed-form lower bounds (on the expected maximum proportion of significant nodes for subgraphs of a given size, under the null hypothesis of no anomalous patterns). These advances, along with the integration of recent core-tree decomposition methods, enable CNSS to scale to large real-world graphs, with substantial improvement in the accuracy of detected subgraphs. Extensive experimentsmore »on both semi-synthetic and real-world datasets are demonstrated to validate the effectiveness of our proposed methods, in comparison with state-of-the-art counterparts.« less
  2. In many real-world applications of monitoring multivariate spatio-temporal data that are non-stationary over time, one is often interested in detecting hot-spots with spatial sparsity and temporal consistency, instead of detecting system-wise changes as in traditional statistical process control (SPC) literature. In this paper, we propose an efficient method to detect hot-spots through tensor decomposition, and our method has three steps. First, we fit the observed data into a Smooth Sparse Decomposition Tensor (SSD-Tensor) model that serves as a dimension reduction and de-noising technique: it is an additive model decomposing the original data into: smooth but non-stationary global mean, sparse local anomalies, and random noises. Next, we estimate model parameters by the penalized framework that includes Least Absolute Shrinkage and Selection Operator (LASSO) and fused LASSO penalty. An efficient recursive optimization algorithm is developed based on Fast Iterative Shrinkage Thresholding Algorithm (FISTA). Finally, we apply a Cumulative Sum (CUSUM) Control Chart to monitor model residuals after removing global means, which helps to detect when and where hot-spots occur. To demonstrate the usefulness of our proposed SSD-Tensor method, we compare it with several other methods including scan statistics, LASSO-based, PCA-based, T2-based control chart in extensive numerical simulation studies and a real crimemore »rate dataset.« less

    For over 40 yr, the global centroid-moment tensor (GCMT) project has determined location and source parameters for globally recorded earthquakes larger than magnitude 5.0. The GCMT database remains a trusted staple for the geophysical community. Its point-source moment-tensor solutions are the result of inversions that model long-period observed seismic waveforms via normal-mode summation for a 1-D reference earth model, augmented by path corrections to capture 3-D variations in surface wave phase speeds, and to account for crustal structure. While this methodology remains essentially unchanged for the ongoing GCMT catalogue, source inversions based on waveform modelling in low-resolution 3-D earth models have revealed small but persistent biases in the standard modelling approach. Keeping pace with the increased capacity and demands of global tomography requires a revised catalogue of centroid-moment tensors (CMT), automatically and reproducibly computed using Green's functions from a state-of-the-art 3-D earth model. In this paper, we modify the current procedure for the full-waveform inversion of seismic traces for the six moment-tensor parameters, centroid latitude, longitude, depth and centroid time of global earthquakes. We take the GCMT solutions as a point of departure but update them to account for the effects of a heterogeneous earth, using the global 3-Dmore »wave speed model GLAD-M25. We generate synthetic seismograms from Green's functions computed by the spectral-element method in the 3-D model, select observed seismic data and remove their instrument response, process synthetic and observed data, select segments of observed and synthetic data based on similarity, and invert for new model parameters of the earthquake’s centroid location, time and moment tensor. The events in our new, preliminary database containing 9382 global event solutions, called CMT3D for ‘3-D centroid-moment tensors’, are on average 4 km shallower, about 1 s earlier, about 5 per cent larger in scalar moment, and more double-couple in nature than in the GCMT catalogue. We discuss in detail the geographical and statistical distributions of the updated solutions, and place them in the context of earlier work. We plan to disseminate our CMT3D solutions via the online ShakeMovie platform.

    « less
  4. We generalize the spatial and subset scan statistics from the single to the multiple subset case. The two main approaches to defining the log-likelihood ratio statistic in the single subset case—the population-based and expectation-based scan statistics—are considered, leading to risk partitioning and multiple cluster detection scan statistics, respectively. We show that, for distributions in a separable exponential family, the risk partitioning scan statistic can be expressed as a scaled f-divergence of the normalized count and baseline vectors, and the multiple cluster detection scan statistic as a sum of scaled Bregman divergences. In either case, however, maximization of the scan statistic by exhaustive search over all partitionings of the data requires exponential time. To make this optimization computationally feasible, we prove sufficient conditions under which the optimal partitioning is guaranteed to be consecutive. This Consecutive Partitions Property generalizes the linear-time subset scanning property from two partitions (the detected subset and the remaining data elements) to the multiple partition case. While the number of consecutive partitionings of n elements into t partitions scales as O(n^(t−1)), making it computationally expensive for large t, we present a dynamic programming approach which identifies the optimal consecutive partitioning in O(n^2 t) time, thus allowing for themore »exact and efficient solution of large-scale risk partitioning and multiple cluster detection problems. Finally, we demonstrate the detection performance and practical utility of partition scan statistics using simulated and real-world data. Supplementary materials for this article are available online.« less
  5. Abstract Higher-order tensors can represent scores in a rating system, frames in a video, and images of the same subject. In practice, the measurements are often highly quantized due to the sampling strategies or the quality of devices. Existing works on tensor recovery have focused on data losses and random noises. Only a few works consider tensor recovery from quantized measurements but are restricted to binary measurements. This paper, for the first time, addresses the problem of tensor recovery from multi-level quantized measurements by leveraging the low CANDECOMP/PARAFAC (CP) rank property. We study the recovery of both general low-rank tensors and tensors that have tensor singular value decomposition (TSVD) by solving nonconvex optimization problems. We provide the theoretical upper bounds of the recovery error, which diminish to zero when the sizes of dimensions increase to infinity. We further characterize the fundamental limit of any recovery algorithm and show that our recovery error is nearly order-wise optimal. A tensor-based alternating proximal gradient descent algorithm with a convergence guarantee and a TSVD-based projected gradient descent algorithm are proposed to solve the nonconvex problems. Our recovery methods can also handle data losses and do not necessarily need the information of the quantization rule.more »The methods are validated on synthetic data, image datasets, and music recommender datasets.« less