In many real-world applications that monitor multivariate spatio-temporal data that are non-stationary over time, one is often interested in detecting hot-spots with spatial sparsity and temporal consistency, rather than the system-wise changes considered in the traditional statistical process control (SPC) literature. In this paper, we propose an efficient three-step method to detect hot-spots through tensor decomposition. First, we fit the observed data to a Smooth Sparse Decomposition Tensor (SSD-Tensor) model that serves as a dimension reduction and de-noising technique: it is an additive model that decomposes the original data into a smooth but non-stationary global mean, sparse local anomalies, and random noise. Next, we estimate the model parameters within a penalized framework that combines Least Absolute Shrinkage and Selection Operator (LASSO) and fused LASSO penalties, using an efficient recursive optimization algorithm based on the Fast Iterative Shrinkage Thresholding Algorithm (FISTA). Finally, we apply a Cumulative Sum (CUSUM) control chart to monitor the model residuals after removing the global mean, which helps to detect when and where hot-spots occur. To demonstrate the usefulness of the proposed SSD-Tensor method, we compare it with several other methods, including scan statistics and LASSO-based, PCA-based, and T²-based control charts, in extensive numerical simulation studies and on a real crime rate dataset.
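As a rough illustration of the estimation step named in the abstract above, the following sketch applies FISTA to a plain LASSO problem. It is not the SSD-Tensor algorithm itself: the fused LASSO penalty and the tensor structure are omitted, and the names `soft_threshold`, `fista_lasso`, `A`, `y`, and `lam` are hypothetical, chosen only for this example.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise soft-thresholding: the proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def fista_lasso(A, y, lam, n_iter=500):
    """Minimise 0.5 * ||y - A x||_2^2 + lam * ||x||_1 with FISTA.

    Illustrative only: plain LASSO, not the penalized SSD-Tensor objective.
    """
    # Lipschitz constant of the gradient of the smooth term (squared spectral norm).
    lipschitz = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    z = x.copy()  # extrapolated point used for the gradient step
    t = 1.0       # momentum parameter
    for _ in range(n_iter):
        grad = A.T @ (A @ z - y)
        x_new = soft_threshold(z - grad / lipschitz, lam / lipschitz)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x
```

With an identity design matrix the minimiser is simply the soft-thresholded observation vector, which gives a quick sanity check on the iteration.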
Rapid Detection of Hot-spot by Tensor Decomposition on Space and Circular Time with Application to Weekly Gonorrhea data
In many bio-surveillance and healthcare applications, data are measured at many spatial locations repeatedly over time, for example daily, weekly, or monthly. In these applications, we are typically interested in detecting hot-spots, which are defined as structured outliers that are sparse over the spatial domain but persistent over time. In this paper, we propose a tensor decomposition method to detect when and where the hot-spots occur. Our proposed method represents the observed raw data as a three-dimensional tensor that includes a circular time dimension for daily/weekly/monthly patterns, and then decomposes the tensor into three components: a smooth global trend, local hot-spots, and residuals. A combination of LASSO and fused LASSO is used to estimate the model parameters, and a CUSUM procedure is applied to detect when and where the hot-spots might occur. The usefulness of our proposed methodology is validated through numerical simulation and a real-world dataset of the weekly number of gonorrhea cases from 2006 to 2018 for 50 states in the United States.
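The two ideas in the abstract above, a circular time dimension and CUSUM monitoring, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, a fixed period of 52 weeks per year is assumed (any leftover weeks are trimmed), and the textbook one-sided CUSUM recursion with a hypothetical allowance `k` stands in for the paper's statistic, which the abstract does not specify.

```python
import numpy as np


def to_circular_tensor(counts, period=52):
    """Reshape (n_locations, n_weeks) counts into (n_locations, period, n_years).

    Assumes a fixed number of weeks per year; trailing weeks that do not
    fill a complete year are dropped.
    """
    n_loc, n_weeks = counts.shape
    n_years = n_weeks // period
    trimmed = counts[:, : n_years * period]
    # Rows are laid out year by year, so reshape to (location, year, week)
    # and then swap axes so week-of-year becomes the second mode.
    return trimmed.reshape(n_loc, n_years, period).transpose(0, 2, 1)


def cusum_upper(residuals, k=0.5):
    """One-sided CUSUM path: W_t = max(0, W_{t-1} + r_t - k)."""
    w = 0.0
    path = []
    for r in residuals:
        w = max(0.0, w + r - k)
        path.append(w)
    return np.array(path)
```

In use, an alarm would be raised the first time the CUSUM path for a location exceeds a control limit chosen to meet a target false-alarm rate; that limit is not given in the abstract and would have to be calibrated separately.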
- Award ID(s): 1830372
- PAR ID: 10184526
- Date Published:
- Journal Name: THE XIIITH INTERNATIONAL WORKSHOP ON INTELLIGENT STATISTICAL QUALITY CONTROL 2019
- Page Range / eLocation ID: 289-309
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- The use of video-imaging data for in-line process monitoring applications has become popular in industry. In this framework, spatio-temporal statistical process monitoring methods are needed to capture the relevant information content and signal possible out-of-control states. Video-imaging data are characterized by a spatio-temporal variability structure that depends on the underlying phenomenon, and typical out-of-control patterns are related to events that are localized both in time and space. In this article, we propose an integrated spatio-temporal decomposition and regression approach for anomaly detection in video-imaging data. Out-of-control events are typically sparse, spatially clustered and temporally consistent. The goal is not only to detect the anomaly as quickly as possible (“when”) but also to locate it in space (“where”). The proposed approach works by decomposing the original spatio-temporal data into random natural events, sparse spatially clustered and temporally consistent anomalous events, and random noise. Recursive estimation procedures for spatio-temporal regression are presented to enable the real-time implementation of the proposed methodology. Finally, a likelihood ratio test procedure is proposed to detect when and where the anomaly happens. The proposed approach was applied to the analysis of high-speed video-imaging data to detect and locate local hot-spots during a metal additive manufacturing process.
- Helicopters used for aerial wildlife surveys are expensive, dangerous and time consuming. Drones and thermal infrared cameras can detect wildlife, though the ability to detect individuals is dependent on weather conditions. While we have a good understanding of local weather conditions, we do not have a broad-scale assessment of ambient temperature to plan drone wildlife surveys. Climate change will affect our ability to conduct thermal surveys in the future. Our objective was to determine optimal annual and daily time periods to conduct surveys. We present a case study in Texas (United States of America [USA]), where we acquired and compared average monthly temperature data from 1990 to 2019, hourly temperature data from 2010 to 2019 and projected monthly temperature data from 2021 to 2040 to identify areas where surveys would detect a commonly studied ungulate (white-tailed deer [Odocoileus virginianus]) during sunny or cloudy conditions. Mean temperatures increased when comparing the 1990–2019 to 2010–2019 periods. Mean temperatures above the maximum ambient temperature in which white-tailed deer can be detected increased in 72, 10, 10, and 24 of the 254 Texas counties in June, July, August, and September, respectively. Future climate projections indicate that temperatures above the maximum ambient temperature in which white-tailed deer can be detected will increase in 32, 12, 15, and 47 counties in June, July, August, and September, respectively, when comparing 2010–2019 with 2021–2040. This analysis can assist in planning and scheduling thermal drone wildlife surveys across the year and, combined with daily data, can make planning drone flights more efficient.
- Tracking the COVID-19 pandemic has been a major challenge for policy makers. Although several efforts are ongoing for accurate forecasting of cases, deaths, and hospitalizations at various resolutions, few have been attempted for college campuses despite their potential to become COVID-19 hot-spots. In this paper, we present a real-time effort towards weekly forecasting of campus-level cases during the fall semester for four universities in Virginia, United States. We discuss the challenges related to data curation. A causal model is employed for forecasting with one free time-varying parameter, calibrated against case data. The model is then run forward in time to obtain multiple forecasts. We retrospectively evaluate the performance and, while forecast quality suffers during the campus reopening phase, the model makes reasonable forecasts as the fall semester progresses. We provide sensitivity analysis for several model parameters. In addition, the forecasts are provided weekly to various state and local agencies.
- Understanding dynamic human mobility changes and spatial interaction patterns at different geographic scales is crucial for assessing the impacts of non-pharmaceutical interventions (such as stay-at-home orders) during the COVID-19 pandemic. In this data descriptor, we introduce a regularly-updated multiscale dynamic human mobility flow dataset across the United States, with data starting from March 1st, 2020. By analysing millions of anonymous mobile phone users’ visits to various places provided by SafeGraph, the daily and weekly dynamic origin-to-destination (O-D) population flows are computed, aggregated, and inferred at three geographic scales: census tract, county, and state. There is high correlation between our mobility flow dataset and openly available data sources, which shows the reliability of the produced data. Such a high spatiotemporal resolution human mobility flow dataset at different geographic scales over time may help monitor epidemic spreading dynamics, inform public health policy, and deepen our understanding of human behaviour changes under the unprecedented public health crisis. This up-to-date O-D flow open data can support many other social sensing and transportation applications.

