Abstract We introduce a novel geometry‐oriented methodology, based on the emerging tools of topological data analysis, into the change‐point detection framework. The key rationale is that change points are likely to be associated with changes in the geometry behind the data‐generating process. While the applications of topological data analysis to change‐point detection are potentially very broad, in this paper we primarily focus on integrating topological concepts with existing nonparametric methods for change‐point detection. In particular, the proposed geometry‐oriented approach aims to enhance the accuracy with which the locations of distributional regime shifts are detected. Our simulation studies suggest that integrating topological data analysis with some existing change‐point detection algorithms leads to consistently more accurate detection results. We illustrate the new methodology with an application to two closely related environmental time series data sets: ice phenology of Lake Baikal and the North Atlantic Oscillation indices, in a search for a possible association between their estimated regime shift locations.
Segmenting Time Series via Self-Normalisation
Abstract We propose a novel and unified framework for change-point estimation in multivariate time series. The proposed method is fully non-parametric, robust to temporal dependence and avoids the demanding consistent estimation of long-run variance. One salient and distinct feature of the proposed method is its versatility: it allows change-point detection for a broad class of parameters (such as mean, variance, correlation and quantile) in a unified fashion. At the core of our method, we couple self-normalisation (SN) based tests with a novel nested local-window segmentation algorithm, which seems new in the growing literature of change-point analysis. Due to the presence of an inconsistent long-run variance estimator in the SN test, non-standard theoretical arguments are further developed to derive the consistency and convergence rate of the proposed SN-based change-point detection method. Extensive numerical experiments and relevant real data analysis are conducted to illustrate the effectiveness and broad applicability of our proposed method in comparison with state-of-the-art approaches in the literature.
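To make the self-normalisation idea concrete, here is a minimal Python sketch of an SN-type statistic for a single change in the mean of a univariate series. It is an illustration under stated assumptions, not the nested local-window algorithm of the paper: the statistic follows one commonly used SN form (a squared, scaled contrast of segment means divided by a normaliser built from partial-sum bridges), and the names `sn_mean_statistic`, `sn_scan` and the `trim` parameter are hypothetical. The normaliser takes the place of a long-run variance estimate, which is why no bandwidth needs to be chosen.

```python
import numpy as np


def sn_mean_statistic(x, k):
    """SN-type statistic for a mean change right after observation k (1 <= k < n).

    Illustrative form only: squared scaled contrast of the two segment means,
    divided by a self-normaliser built from partial-sum bridges.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    left, right = x[:k], x[k:]
    m = n - k

    # Scaled contrast between the mean before and the mean after k.
    d = k * m / n**1.5 * (left.mean() - right.mean())

    # Left bridge: forward partial sums, recentred by their end value.
    fwd = np.cumsum(left)
    left_bridge = fwd - np.arange(1, k + 1) / k * fwd[-1]

    # Right bridge: backward partial sums, recentred by their start value.
    bwd = np.cumsum(right[::-1])[::-1]
    right_bridge = bwd - np.arange(m, 0, -1) / m * bwd[0]

    v = (np.sum(left_bridge**2) + np.sum(right_bridge**2)) / n**2
    return np.inf if v == 0 else d**2 / v


def sn_scan(x, trim=0.1):
    """Scan candidate split points, skipping a fraction `trim` at each end.

    Returns the split with the largest statistic and that statistic.
    """
    n = len(x)
    lo = max(2, int(np.ceil(trim * n)))
    hi = min(n - 2, int(np.floor((1 - trim) * n)))
    ks = range(lo, hi + 1)
    stats = [sn_mean_statistic(x, k) for k in ks]
    best = int(np.argmax(stats))
    return ks[best], stats[best]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated series: mean 0 for 100 points, then mean 3 for 100 points.
    series = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
    k_hat, stat = sn_scan(series)
    print("estimated change point:", k_hat, "statistic:", stat)
```

The scan handles only one change point in the mean. Extending it to multiple change points or to other parameters (variance, correlation, quantiles) would need the segmentation scheme and the parameter-specific statistics described in the paper, which this sketch does not attempt to reproduce.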
- Award ID(s): 2014053
- PAR ID: 10444361
- Date Published:
- Journal Name: Journal of the Royal Statistical Society Series B: Statistical Methodology
- Volume: 84
- Issue: 5
- ISSN: 1369-7412
- Page Range / eLocation ID: 1699 to 1725
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Accurate estimation of localized occupancy-related information in real time enables a broad range of intelligent smart environment applications. A large number of studies using heterogeneous sensor arrays reflect the myriad requirements of various emerging pervasive, ubiquitous and participatory sensing applications. In this paper, we introduce a zero-configuration and infrastructure-less smartphone-based location-specific occupancy estimation model. We opportunistically exploit the smartphone's acoustic sensors in a conversing environment and its motion sensors in the absence of any conversational data. We demonstrate a novel speaker estimation algorithm based on unsupervised clustering of overlapped and non-overlapped conversational data, and a change point detection algorithm for locomotive motion of the users, to infer the occupancy. We augment our occupancy detection model with a fingerprinting-based methodology using the smartphone's magnetometer sensor to accurately assimilate location information of any gathering. We postulate a novel crowdsourcing-based approach to annotate the semantic location of the occupancy. We evaluate our algorithms in different contexts (conversational, silent and mixed) in the presence of 10 domestic users. Our experimental results on real-life data traces in natural settings show that, using this hybrid approach, we can achieve approximately 0.76 error count distance for occupancy detection accuracy on average.
- Detecting when the underlying distribution changes for the observed time series is a fundamental problem arising in a broad spectrum of applications. In this paper, we study multiple change-point localization in the high-dimensional regression setting, which is particularly challenging as no direct observation of the parameter of interest is available. Specifically, we assume we observe $\{x_t, y_t\}_{t=1}^{n}$, where $\{x_t\}_{t=1}^{n}$ are $p$-dimensional covariates, $\{y_t\}_{t=1}^{n}$ are the univariate responses satisfying $\mathbb{E}(y_t) = x_t^\top \beta_t^*$ for $1 \le t \le n$, and $\{\beta_t^*\}_{t=1}^{n}$ are the unobserved regression coefficients that change over time in a piecewise constant manner. We propose a novel projection-based algorithm, Variance Projected Wild Binary Segmentation (VPWBS), which transforms the original (difficult) problem of change-point detection in $p$-dimensional regression to a simpler problem of change-point detection in the mean of a one-dimensional time series. VPWBS is shown to achieve the sharp localization rate $O_p(1/n)$ up to a log factor, a significant improvement over the best rate $O_p(1/\sqrt{n})$ known in the existing literature for multiple change-point localization in high-dimensional regression. Extensive numerical experiments are conducted to demonstrate the robust and favorable performance of VPWBS over two state-of-the-art algorithms, especially when the size of change in the regression coefficients $\{\beta_t^*\}_{t=1}^{n}$ is small.
- Abstract We propose a piecewise linear quantile trend model to analyse the trajectory of the COVID-19 daily new cases (i.e. the infection curve) simultaneously across multiple quantiles. The model is intuitive, interpretable and naturally captures the phase transitions of the epidemic growth rate via change-points. Unlike the mean trend model and least squares estimation, our quantile-based approach is robust to outliers, captures heteroscedasticity (commonly exhibited by COVID-19 infection curves) and automatically delivers both point and interval forecasts with minimal assumptions. Building on a self-normalized (SN) test statistic, this paper proposes a novel segmentation algorithm for multiple change-point estimation. Theoretical guarantees such as segmentation consistency are established under mild and verifiable assumptions. Using the proposed method, we analyse the COVID-19 infection curves in 35 major countries and discover patterns with potentially relevant implications for the effectiveness of the pandemic responses by different countries. A simple change-adaptive two-stage forecasting scheme is further designed to generate short-term predictions of COVID-19 cumulative new cases and is shown to deliver accurate forecasts valuable to public health decision-making.
- Non-stationary bandits and clustered bandits lift the restrictive assumptions in contextual bandits and provide solutions to many important real-world scenarios. Though they have been studied independently so far, we point out that the essence of solving these two problems overlaps considerably. In this work, we connect these two strands of bandit research under the notion of test of homogeneity, which seamlessly addresses change detection for non-stationary bandits and cluster identification for clustered bandits in a unified solution framework. Rigorous regret analysis and extensive empirical evaluations demonstrate the value of our proposed solution, especially its flexibility in handling various environment assumptions, e.g., a clustered non-stationary environment.

