Title: Rapid detection of hot-spots via tensor decomposition with applications to crime rate data
In many real-world applications that monitor multivariate spatio-temporal data that are non-stationary over time, one is often interested in detecting hot-spots with spatial sparsity and temporal consistency, rather than detecting system-wise changes as in the traditional statistical process control (SPC) literature. In this paper, we propose an efficient three-step method to detect hot-spots through tensor decomposition. First, we fit the observed data with a Smooth Sparse Decomposition Tensor (SSD-Tensor) model that serves as a dimension reduction and de-noising technique: it is an additive model that decomposes the original data into a smooth but non-stationary global mean, sparse local anomalies, and random noise. Next, we estimate the model parameters within a penalized framework that includes Least Absolute Shrinkage and Selection Operator (LASSO) and fused LASSO penalties. An efficient recursive optimization algorithm is developed based on the Fast Iterative Shrinkage Thresholding Algorithm (FISTA). Finally, we apply a Cumulative Sum (CUSUM) control chart to monitor the model residuals after removing the global mean, which helps to detect when and where hot-spots occur. To demonstrate the usefulness of the proposed SSD-Tensor method, we compare it with several other methods, including scan statistics and LASSO-based, PCA-based, and T²-based control charts, in extensive numerical simulation studies and on a real crime rate dataset.
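The estimation and monitoring steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a plain LASSO problem (the fused LASSO penalty and the tensor basis structure of the SSD-Tensor model are omitted) and a simple one-sided CUSUM run independently per location. The function names `fista_lasso` and `cusum_alarm`, the reference value `k`, and the threshold `h` are hypothetical choices made only for illustration.

```python
import numpy as np


def soft_threshold(x, lam):
    """Proximal operator of lam * ||x||_1 (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def fista_lasso(A, y, lam, n_iter=200):
    """Minimize 0.5 * ||y - A b||^2 + lam * ||b||_1 with FISTA."""
    # Lipschitz constant of the gradient of the smooth part
    L = np.linalg.norm(A, 2) ** 2
    b = np.zeros(A.shape[1])
    z = b.copy()  # extrapolated point
    t = 1.0       # momentum parameter
    for _ in range(n_iter):
        grad = A.T @ (A @ z - y)
        b_new = soft_threshold(z - grad / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = b_new + ((t - 1.0) / t_new) * (b_new - b)
        b, t = b_new, t_new
    return b


def cusum_alarm(residuals, k, h):
    """One-sided CUSUM per location (an assumed, simplified scheme).

    residuals: array of shape (T, n_locations), global mean already removed.
    Returns (first alarm time, indices of flagged locations),
    or (None, empty array) if no statistic ever exceeds h.
    """
    W = np.zeros(residuals.shape[1])
    for t, r in enumerate(residuals):
        W = np.maximum(0.0, W + r - k)  # accumulate upward drift only
        hot = np.flatnonzero(W > h)
        if hot.size:
            return t, hot
    return None, np.array([], dtype=int)
```

With an identity design matrix the LASSO solution reduces to soft-thresholding of `y`, which gives a quick sanity check for `fista_lasso`. For `cusum_alarm`, the alarm time answers "when" and the flagged indices answer "where"; in practice `k` and `h` would be tuned to a target false-alarm rate.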
Award ID(s):
1830363 1830344 2015405 1830372
NSF-PAR ID:
10291553
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Journal of Applied Statistics
ISSN:
0266-4763
Page Range / eLocation ID:
1 to 27
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In many bio-surveillance and healthcare applications, data are collected from many spatial locations repeatedly over time, say, daily, weekly, or monthly. In these applications, we are typically interested in detecting hot-spots, which are defined as structured outliers that are sparse over the spatial domain but persistent over time. In this paper, we propose a tensor decomposition method to detect when and where the hot-spots occur. Our proposed method represents the observed raw data as a three-dimensional tensor that includes a circular time dimension for daily, weekly, or monthly patterns, and then decomposes the tensor into three components: a smooth global trend, local hot-spots, and residuals. A combination of LASSO and fused LASSO is used to estimate the model parameters, and a CUSUM procedure is applied to detect when and where the hot-spots might occur. The usefulness of our proposed methodology is validated through numerical simulation and a real-world dataset of the weekly number of gonorrhea cases from 2006 to 2018 for 50 states in the United States.
  2. Graphs have been commonly used to represent complex data structures. In models dealing with graph-structured data, multivariate parameters may not only exhibit sparse patterns but have structured sparsity and smoothness in the sense that both zero and non-zero parameters tend to cluster together. We propose a new prior for high-dimensional parameters with graphical relations, referred to as the Tree-based Low-rank Horseshoe (T-LoHo) model, that generalizes the popular univariate Bayesian horseshoe shrinkage prior to the multivariate setting to detect structured sparsity and smoothness simultaneously. The T-LoHo prior can be embedded in many high-dimensional hierarchical models. To illustrate its utility, we apply it to regularize a Bayesian high-dimensional regression problem where the regression coefficients are linked by a graph, so that the resulting clusters have flexible shapes and satisfy the cluster contiguity constraint with respect to the graph. We design an efficient Markov chain Monte Carlo algorithm that delivers full Bayesian inference with uncertainty measures for model parameters such as the number of clusters. We offer theoretical investigations of the clustering effects and posterior concentration results. Finally, we illustrate the performance of the model with simulation studies and a real data application for anomaly detection on a road network. The results indicate substantial improvements over other competing methods such as the sparse fused lasso. 
  3. Event detection is gaining increasing attention in smart cities research. Large-scale mobility data serve as an important tool to uncover the dynamics of urban transportation systems, and more often than not the dataset is incomplete. In this article, we develop a method to detect extreme events in large traffic datasets and to impute missing data during regular conditions. Specifically, we propose a robust tensor recovery problem to recover low-rank tensors under fiber-sparse corruptions with partial observations, and use it to identify events and impute missing data under typical conditions. Our approach is scalable to large urban areas, taking full advantage of the spatio-temporal correlations in traffic patterns. We develop an efficient algorithm to solve the tensor recovery problem based on the alternating direction method of multipliers (ADMM) framework. Compared with existing ℓ1-norm regularized tensor decomposition methods, our algorithm can exactly recover the values of uncorrupted fibers of a low-rank tensor and find the positions of corrupted fibers under mild conditions. Numerical experiments illustrate that our algorithm can achieve exact recovery and outlier detection even with missing data rates as high as 40% under 5% gross corruption, depending on the tensor size and the Tucker rank of the low-rank tensor. Finally, we apply our method to a real traffic dataset corresponding to downtown Nashville, TN, and successfully detect events such as severe car crashes, construction lane closures, and other large events that cause significant traffic disruptions.
  4. The use of video-imaging data for in-line process monitoring applications has become popular in industry. In this framework, spatio-temporal statistical process monitoring methods are needed to capture the relevant information content and signal possible out-of-control states. Video-imaging data are characterized by a spatio-temporal variability structure that depends on the underlying phenomenon, and typical out-of-control patterns are related to events that are localized both in time and space. In this article, we propose an integrated spatio-temporal decomposition and regression approach for anomaly detection in video-imaging data. Out-of-control events are typically sparse, spatially clustered and temporally consistent. The goal is not only to detect the anomaly as quickly as possible (“when”) but also to locate it in space (“where”). The proposed approach works by decomposing the original spatio-temporal data into random natural events, sparse spatially clustered and temporally consistent anomalous events, and random noise. Recursive estimation procedures for spatio-temporal regression are presented to enable the real-time implementation of the proposed methodology. Finally, a likelihood ratio test procedure is proposed to detect when and where the anomaly happens. The proposed approach was applied to the analysis of high-speed video-imaging data to detect and locate local hot-spots during a metal additive manufacturing process.
  5. Table of Contents:
     Foreword by the CI 2016 Workshop Chairs
     Foreword by the CI 2016 Steering Committee
     List of Organizing Committee
     List of Registered Participants
     Acknowledgement of Sponsors
     Hackathon and Workshop Agenda
     Hackathon Summary
     Invited talks - abstracts and links to presentations
     Proceedings: 34 short research papers
     Papers
     1. BAYESIAN MODELS FOR CLIMATE RECONSTRUCTION FROM POLLEN RECORDS (Lasse Holmström, Liisa Ilvonen, Heikki Seppä, Siim Veski)
     2. ON INFORMATION CRITERIA FOR DYNAMIC SPATIO-TEMPORAL CLUSTERING (Ethan D. Schaeffer, Jeremy M. Testa, Yulia R. Gel, Vyacheslav Lyubchich)
     3. DETECTING MULTIVARIATE BIOSPHERE EXTREMES (Yanira Guanche García, Erik Rodner, Milan Flach, Sebastian Sippel, Miguel Mahecha, Joachim Denzler)
     4. SPATIO-TEMPORAL GENERATIVE MODELS FOR RAINFALL OVER INDIA (Adway Mitra)
     5. A NONPARAMETRIC COPULA BASED BIAS CORRECTION METHOD FOR STATISTICAL DOWNSCALING (Yi Li, Adam Ding, Jennifer Dy)
     6. DETECTING AND PREDICTING BEAUTIFUL SUNSETS USING SOCIAL MEDIA DATA (Emma Pierson)
     7. OCEANTEA: EXPLORING OCEAN-DERIVED CLIMATE DATA USING MICROSERVICES (Arne N. Johanson, Sascha Flögel, Wolf-Christian Dullo, Wilhelm Hasselbring)
     8. IMPROVED ANALYSIS OF EARTH SYSTEM MODELS AND OBSERVATIONS USING SIMPLE CLIMATE MODELS (Balu Nadiga, Nathan Urban)
     9. SYNERGY AND ANALOGY BETWEEN 15 YEARS OF MICROWAVE SST AND ALONG-TRACK SSH (Pierre Tandeo, Aitor Atencia, Cristina Gonzalez-Haro)
     10. PREDICTING EXECUTION TIME OF CLIMATE-DRIVEN ECOLOGICAL FORECASTING MODELS (Scott Farley and John W. Williams)
     11. SPATIOTEMPORAL ANALYSIS OF SEASONAL PRECIPITATION OVER US USING CO-CLUSTERING (Mohammad Gorji–Sefidmazgi, Clayton T. Morrison)
     12. PREDICTION OF EXTREME RAINFALL USING HYBRID CONVOLUTIONAL-LONG SHORT TERM MEMORY NETWORKS (Sulagna Gope, Sudeshna Sarkar, Pabitra Mitra)
     13. SPATIOTEMPORAL PATTERN EXTRACTION WITH DATA-DRIVEN KOOPMAN OPERATORS FOR CONVECTIVELY COUPLED EQUATORIAL WAVES (Joanna Slawinska, Dimitrios Giannakis)
     14. COVARIANCE STRUCTURE ANALYSIS OF CLIMATE MODEL OUTPUT (Chintan Dalal, Doug Nychka, Claudia Tebaldi)
     15. SIMPLE AND EFFICIENT TENSOR REGRESSION FOR SPATIOTEMPORAL FORECASTING (Rose Yu, Yan Liu)
     16. TRACKING OF TROPICAL INTRASEASONAL CONVECTIVE ANOMALIES (Bohar Singh, James L. Kinter)
     17. ANALYSIS OF AMAZON DROUGHTS USING SUPERVISED KERNEL PRINCIPAL COMPONENT ANALYSIS (Carlos H. R. Lima, Amir AghaKouchak)
     18. A BAYESIAN PREDICTIVE ANALYSIS OF DAILY PRECIPITATION DATA (Sai K. Popuri, Nagaraj K. Neerchal, Amita Mehta)
     19. INCORPORATING PRIOR KNOWLEDGE IN SPATIO-TEMPORAL NEURAL NETWORK FOR CLIMATIC DATA (Arthur Pajot, Ali Ziat, Ludovic Denoyer, Patrick Gallinari)
     20. DIMENSIONALITY-REDUCTION OF CLIMATE DATA USING DEEP AUTOENCODERS (Juan A. Saenz, Nicholas Lubbers, Nathan M. Urban)
     21. MAPPING PLANTATION IN INDONESIA (Xiaowei Jia, Ankush Khandelwal, James Gerber, Kimberly Carlson, Paul West, Vipin Kumar)
     22. FROM CLIMATE DATA TO A WEIGHTED NETWORK BETWEEN FUNCTIONAL DOMAINS (Ilias Fountalis, Annalisa Bracco, Bistra Dilkina, Constantine Dovrolis)
     23. EMPLOYING SOFTWARE ENGINEERING PRINCIPLES TO ENHANCE MANAGEMENT OF CLIMATOLOGICAL DATASETS FOR CORAL REEF ANALYSIS (Mark Jenne, M.M. Dalkilic, Claudia Johnson)
     24. Profiler Guided Manual Optimization for Accelerating Cholesky Decomposition on R Environment (V.B. Ramakrishnaiah, R.P. Kumar, J. Paige, D. Hammerling, D. Nychka)
     25. GLOBAL MONITORING OF SURFACE WATER EXTENT DYNAMICS USING SATELLITE DATA (Anuj Karpatne, Ankush Khandelwal and Vipin Kumar)
     26. TOWARD QUANTIFYING TROPICAL CYCLONE RISK USING DIAGNOSTIC INDICES (Erica M. Staehling and Ryan E. Truchelut)
     27. OPTIMAL TROPICAL CYCLONE INTENSITY ESTIMATES WITH UNCERTAINTY FROM BEST TRACK DATA (Suz Tolwinski-Ward)
     28. EXTREME WEATHER PATTERN DETECTION USING DEEP CONVOLUTIONAL NEURAL NETWORK (Yunjie Liu, Evan Racah, Prabhat, Amir Khosrowshahi, David Lavers, Kenneth Kunkel, Michael Wehner, William Collins)
     29. INFORMATION TRANSFER ACROSS TEMPORAL SCALES IN ATMOSPHERIC DYNAMICS (Nikola Jajcay and Milan Paluš)
     30. Identifying precipitation regimes in China using model-based clustering of spatial functional data (Haozhe Zhang, Zhengyuan Zhu, Shuiqing Yin)
     31. RELATIONAL RECURRENT NEURAL NETWORKS FOR SPATIOTEMPORAL INTERPOLATION FROM MULTI-RESOLUTION CLIMATE DATA (Guangyu Li, Yan Liu)
     32. OBJECTIVE SELECTION OF ENSEMBLE BOUNDARY CONDITIONS FOR CLIMATE DOWNSCALING (Andrew Rhines, Naomi Goldenson)
     33. LONG-LEAD PREDICTION OF EXTREME PRECIPITATION CLUSTER VIA A SPATIO-TEMPORAL CONVOLUTIONAL NEURAL NETWORK (Yong Zhuang, Wei Ding)
     34. MULTIPLE INSTANCE LEARNING FOR BURNED AREA MAPPING USING MULTI-TEMPORAL REFLECTANCE DATA (Guruprasad Nayak, Varun Mithal, Vipin Kumar)