ABSTRACT This article combines methods from existing techniques to identify multiple changepoints in non‐Gaussian autocorrelated time series. A transformation is used to convert a Gaussian series into a non‐Gaussian series, enabling penalized likelihood methods to handle non‐Gaussian scenarios. When the marginal distribution of the data is continuous, the methods essentially reduce to the change of variables formula for probability densities. When the marginal distribution is count‐valued, Hermite expansions and particle filtering techniques are used to evaluate the likelihood. Simulations demonstrating the efficacy of the methods are given, and two data sets are analyzed: 1) the proportion of home runs hit by Major League Baseball batters from 1920 to 2023, and 2) a six‐dimensional series of tropical cyclone counts from the Earth's basins of generation from 1980 to 2023. In the first series, beta marginal distributions are used to describe the proportions; in the second, Poisson marginal distributions seem appropriate.
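The continuous-marginal construction described above can be sketched in copula style: a latent Gaussian AR(1) series is pushed through the standard normal CDF and then through the quantile function of the target marginal. The sketch below is a minimal stdlib-Python illustration of the generic transform, not the article's estimator; it uses an exponential marginal for simplicity (the article's beta marginal would need a library quantile function such as SciPy's), and all parameter values are illustrative.

```python
import math
import random

def gaussian_ar1(n, phi, seed=0):
    """Simulate a stationary Gaussian AR(1) series with unit marginal variance."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0)]
    scale = math.sqrt(1.0 - phi * phi)  # keeps Var(z_t) = 1 for all t
    for _ in range(n - 1):
        z.append(phi * z[-1] + scale * rng.gauss(0.0, 1.0))
    return z

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def transform_to_exponential(z, rate=1.0):
    """Map each Gaussian value via u = Phi(z_t), y_t = F^{-1}(u): the series
    then has exactly the exponential(rate) marginal distribution."""
    out = []
    for x in z:
        u = min(std_normal_cdf(x), 1.0 - 1e-12)  # guard against u == 1
        out.append(-math.log(1.0 - u) / rate)
    return out

y = transform_to_exponential(gaussian_ar1(2000, phi=0.7))
```

By construction every `y[t]` has the chosen marginal exactly, while autocorrelation is inherited (in distorted form) from the latent Gaussian series; likelihood evaluation then reduces to the change-of-variables formula applied to this transform.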
Seasonal count time series
Count time series are widely encountered in practice. As with continuous-valued data, many count series have seasonal properties. This article uses a recent advance in stationary count time series to develop a general seasonal count time series modeling paradigm. The model constructed here permits any marginal distribution for the series and the most flexible autocorrelations possible, including those with negative dependence. Likelihood methods of inference are explored. The article first develops the modeling methods, which entail a discrete transformation of a Gaussian process having seasonal dynamics. Properties of this model class are then established, and particle filtering likelihood methods of parameter estimation are developed. A simulation study demonstrating the efficacy of the methods is presented, and an application to the number of rainy days in successive weeks in Seattle, Washington is given.
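The discrete transformation underlying this model class can be sketched the same way as in the continuous case, except that the target quantile function is now that of a count distribution. A minimal stdlib-Python sketch, assuming Poisson marginals whose mean cycles through a short vector of seasonal means; the seasonal means, AR coefficient, and period are illustrative choices, not values from the article.

```python
import math
import random

def poisson_quantile(u, lam):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(lam)."""
    k, pmf, cdf = 0, math.exp(-lam), math.exp(-lam)
    while cdf < u:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def seasonal_count_series(n, phi, seasonal_means, seed=1):
    """Counts with Poisson marginals whose mean cycles through seasonal_means;
    autocorrelation is inherited from a latent Gaussian AR(1)."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - phi * phi)  # stationary, unit-variance latent series
    z, y = rng.gauss(0.0, 1.0), []
    for t in range(n):
        u = min(std_normal_cdf(z), 1.0 - 1e-12)  # guard against u == 1
        y.append(poisson_quantile(u, seasonal_means[t % len(seasonal_means)]))
        z = phi * z + scale * rng.gauss(0.0, 1.0)
    return y

# period-2 "seasons" with low and high Poisson means
y = seasonal_count_series(520, phi=0.5, seasonal_means=[2.0, 8.0])
```

Because the quantile step is a monotone map, the marginal distribution in each season is exactly the specified Poisson, which is why likelihood evaluation for such models requires the Hermite-expansion and particle-filtering machinery rather than a closed form.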
- Award ID(s): 2113592
- PAR ID: 10385400
- Publisher / Repository: Wiley-Blackwell
- Date Published:
- Journal Name: Journal of Time Series Analysis
- Volume: 44
- Issue: 1
- ISSN: 0143-9782
- Page Range / eLocation ID: p. 93-124
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
This article provides an accessible introduction to the phenomenon of monotone likelihood in duration modeling of political events. Monotone likelihood arises when covariate values, ordered by failure time, are monotonic, causing parameter estimates to diverge toward infinity. Within political science duration model applications, this problem leads to misinterpretation, model misspecification, and omitted variable biases, among other issues. Using a combination of mathematical exposition, Monte Carlo simulations, and empirical applications, this article illustrates the advantages of Firth's penalized maximum-likelihood estimation in resolving the methodological complications underlying monotone likelihood. The results identify the conditions under which monotone likelihood is most acute and provide guidance for political scientists applying duration modeling techniques in their empirical research.
ABSTRACT This work considers estimation and forecasting in a multivariate, possibly high‐dimensional count time series model constructed from a transformation of a latent Gaussian dynamic factor series. The estimation of the latent model parameters is based on second‐order properties of the count and underlying Gaussian time series, yielding estimators of the underlying covariance matrices for which standard principal component analysis applies. Theoretical consistency results are established for the proposed estimation, building on concentration results for models of this type. They also involve the memory of the latent Gaussian process, quantified through a spectral gap that is shown to be suitably bounded as the model dimension increases, a result of independent interest. In addition, novel cross‐validation schemes are suggested for model selection. The forecasting is carried out through a particle‐based sequential Monte Carlo method, leveraging Kalman filtering techniques. A simulation study and an application are also considered.
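The particle-based likelihood evaluation used for such latent-Gaussian count models can be illustrated with a bootstrap filter: propagate particles through the latent dynamics, weight them by the count observation density, and resample. The stdlib-Python sketch below shows the generic technique, not the authors' estimator; the univariate AR(1) latent state, the Poisson log-link observation model, and all parameter values are illustrative assumptions.

```python
import math
import random

def poisson_logpmf(y, lam):
    return -lam + y * math.log(lam) - math.lgamma(y + 1)

def bootstrap_loglik(obs, phi, n_particles=500, seed=2):
    """Bootstrap particle filter log-likelihood for the toy model
    z_t = phi*z_{t-1} + e_t (stationary, unit variance), y_t ~ Poisson(exp(z_t))."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - phi * phi)
    parts = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # stationary prior
    loglik = 0.0
    for y in obs:
        w = [math.exp(poisson_logpmf(y, math.exp(z))) for z in parts]
        loglik += math.log(sum(w) / n_particles)  # estimates p(y_t | y_{1:t-1})
        parts = rng.choices(parts, weights=w, k=n_particles)  # multinomial resample
        parts = [phi * z + scale * rng.gauss(0.0, 1.0) for z in parts]  # propagate
    return loglik

ll = bootstrap_loglik([1, 0, 2, 3, 1, 0, 0, 2], phi=0.6)
```

The running sum of log average weights is an unbiased-in-expectation estimate of the likelihood, which is what makes sequential Monte Carlo usable for parameter estimation and forecasting when the count likelihood has no closed form.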
Abstract Most of the current public health surveillance methods used in epidemiological studies to identify hotspots of diseases assume that the regional disease case counts are independently distributed, and they lack the ability to adjust for confounding covariates. This article proposes a new approach that uses a simultaneous autoregressive (SAR) model, a popular spatial regression approach, within the classical space‐time cumulative sum (CUSUM) framework for detecting changes in the spatial distribution of count data while accounting for risk factors and spatial correlation. We develop expressions for the likelihood ratio test monitoring statistics based on a SAR model with covariates, leading to the proposed space‐time CUSUM test statistic. The effectiveness of the proposed monitoring approach in detecting and identifying step shifts is studied by simulation of various shift scenarios in regional counts. A case study for monitoring regional COVID‐19 infection counts while adjusting for social vulnerability, often correlated with a community's susceptibility towards disease infection, is presented to illustrate the application of the proposed methodology in public health surveillance.
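The CUSUM side of such a monitoring scheme can be illustrated with the classical one-sided tabular CUSUM applied to a count stream; the SAR-residual step of the article is omitted here, and the reference value `k`, decision limit `h`, and synthetic counts are all illustrative.

```python
def upper_cusum(xs, mu0, k, h):
    """One-sided tabular CUSUM: C_t = max(0, C_{t-1} + x_t - mu0 - k).
    Returns (per-step statistics, first index where C_t > h, or None)."""
    c, stats, alarm = 0.0, [], None
    for t, x in enumerate(xs):
        c = max(0.0, c + x - mu0 - k)
        stats.append(c)
        if alarm is None and c > h:
            alarm = t
    return stats, alarm

# in-control counts near 10, then a step shift up to 16 at t = 50
counts = [10] * 50 + [16] * 20
stats, alarm = upper_cusum(counts, mu0=10.0, k=2.0, h=5.0)
# alarm fires at t = 51, one step after the shift
```

The allowance `k` absorbs in-control fluctuation (the statistic stays at zero before the shift), while sustained exceedances accumulate quickly afterward, which is why CUSUM-type statistics are well suited to detecting step shifts in regional counts.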
Abstract In the form of multidimensional arrays, tensor data have become increasingly prevalent in modern scientific studies and biomedical applications such as computational biology, brain imaging analysis, and process monitoring systems. These data are intrinsically heterogeneous, with complex dependencies and structure. Therefore, ad‐hoc dimension reduction methods on tensor data may lack statistical efficiency and can obscure essential findings. Model‐based clustering is a cornerstone of multivariate statistics and unsupervised learning; however, existing methods and algorithms are not designed for tensor‐variate samples. In this article, we propose a tensor envelope mixture model (TEMM) for simultaneous clustering and multiway dimension reduction of tensor data. TEMM incorporates tensor‐structure‐preserving dimension reduction into mixture modeling and drastically reduces the number of free parameters and estimation variability. An expectation‐maximization‐type algorithm is developed to obtain likelihood‐based estimators of the cluster means and covariances, which are jointly parameterized and constrained onto a series of lower-dimensional subspaces known as the tensor envelopes. We demonstrate the encouraging empirical performance of the proposed method in extensive simulation studies and a real data application, in comparison with existing vector and tensor clustering methods.