ABSTRACT: This work considers estimation and forecasting in a multivariate, possibly high-dimensional count time series model constructed from a transformation of a latent Gaussian dynamic factor series. Estimation of the latent model parameters is based on second-order properties of the count and underlying Gaussian time series, yielding estimators of the underlying covariance matrices to which standard principal component analysis applies. Theoretical consistency results are established for the proposed estimation, building on concentration results for models of this type. These results also involve the memory of the latent Gaussian process, quantified through a spectral gap that is shown to remain suitably bounded as the model dimension increases, which is of independent interest. In addition, novel cross-validation schemes are suggested for model selection. Forecasting is carried out through a particle-based sequential Monte Carlo procedure that leverages Kalman filtering techniques. A simulation study and a data application are also presented.
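The abstract describes counts obtained by transforming a latent Gaussian dynamic factor series, with estimation via PCA applied to second-order quantities. A minimal sketch of such a data-generating mechanism is below; the copula-type Poisson link, the AR(1) factor dynamics, and every parameter value (loadings, AR coefficient, Poisson mean) are illustrative assumptions, not the paper's specification, and the plain sample-covariance PCA is only a stand-in for the second-order estimators the abstract refers to.

```python
import numpy as np
from scipy.stats import norm, poisson

rng = np.random.default_rng(0)
d, r, T = 10, 2, 500  # dimension, number of factors, series length (illustrative)

# Latent Gaussian dynamic factor model: X_t = Lambda f_t + eps_t,
# with AR(1) factors f_t (parameter choices purely for illustration).
Lam = rng.normal(size=(d, r)) / np.sqrt(d)
f = np.zeros((T, r))
for t in range(1, T):
    f[t] = 0.7 * f[t - 1] + np.sqrt(1 - 0.7**2) * rng.normal(size=r)
X = f @ Lam.T + 0.3 * rng.normal(size=(T, d))
X /= X.std(axis=0)  # standardize marginals

# Count series via a copula-type link: Y_it = F^{-1}_Poisson(Phi(X_it))
Y = poisson.ppf(norm.cdf(X), mu=5.0)

# PCA on the sample covariance of the counts (a crude proxy for the
# second-order-based estimation described in the abstract)
S = np.cov(Y, rowvar=False)
eigvals = np.linalg.eigvalsh(S)[::-1]  # descending order
print(eigvals[:3])
```

With a factor structure in the latent series, the leading eigenvalues of the count covariance dominate the rest, which is what makes PCA-type recovery plausible.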
Functional principal component analysis with informative observation times
Summary: Functional principal component analysis has been shown to be invaluable for revealing variation modes of longitudinal outcomes, which serve as important building blocks for forecasting and model building. Decades of research have advanced methods for functional principal component analysis, often assuming independence between the observation times and longitudinal outcomes. Yet such assumptions are fragile in real-world settings where observation times may be driven by outcome-related processes. Rather than ignoring the informative observation time process, we explicitly model the observation times by a general counting process dependent on time-varying prognostic factors. Identification of the mean, covariance function and functional principal components ensues via inverse intensity weighting. We propose using weighted penalized splines for estimation and establish consistency and convergence rates for the weighted estimators. Simulation studies demonstrate that the proposed estimators are substantially more accurate than the existing ones in the presence of a correlation between the observation time process and the longitudinal outcome process. We further examine the finite-sample performance of the proposed method using the Acute Infection and Early Disease Research Program study.
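The core idea of inverse intensity weighting can be illustrated with a toy simulation: candidate visit times are thinned with probability proportional to an outcome-related intensity, so the naive mean over observed visits is biased, while weighting each kept observation by the reciprocal intensity removes the bias. The intensity model exp(0.8 z) and all constants here are hypothetical; the paper itself estimates mean and covariance functions with weighted penalized splines, not the simple weighted mean shown.

```python
import numpy as np

rng = np.random.default_rng(1)

N = 5000
z = rng.normal(size=N)                        # outcome-related prognostic factor
t = rng.uniform(0, 1, size=N)                 # candidate visit times
lam = np.exp(0.8 * z)                         # hypothetical visit intensity
keep = rng.uniform(size=N) < lam / lam.max()  # thinning: large z visits more often
z, t, lam = z[keep], t[keep], lam[keep]

# Outcome is observed only at the kept (informative) visit times
y = np.sin(2 * np.pi * t) + 0.5 * z + rng.normal(scale=0.2, size=z.size)

w = 1.0 / lam                                 # inverse intensity weights
naive = y.mean()                              # biased: over-represents large z
iiw = np.sum(w * y) / np.sum(w)               # reweighted, approximately unbiased
print(naive, iiw)
```

The population mean of y here is 0 (the sine integrates to zero and z has mean zero), so the weighted estimate should sit much closer to 0 than the naive one.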
- Award ID(s): 2242776
- PAR ID: 10596407
- Publisher / Repository: Biometrika
- Date Published:
- Journal Name: Biometrika
- Volume: 112
- Issue: 1
- ISSN: 1464-3510
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Abstract: Sparse functional/longitudinal data have attracted widespread interest due to the prevalence of such data in social and life sciences. A prominent scenario where such data are routinely encountered is accelerated longitudinal studies, where subjects are enrolled in the study at a random time and are only tracked for a short amount of time relative to the domain of interest. The statistical analysis of such functional snippets is challenging since information for far-off-diagonal regions of the covariance structure is missing. Our main methodological contribution is to address this challenge by bypassing covariance estimation and instead modelling the underlying process as the solution of a data-adaptive stochastic differential equation. Taking advantage of the interface between Gaussian functional data and stochastic differential equations makes it possible to efficiently reconstruct the target process by estimating its dynamic distribution. The proposed approach allows one to consistently recover forward sample paths from functional snippets at the subject level. We establish the existence and uniqueness of the solution to the proposed data-driven stochastic differential equation and derive rates of convergence for the corresponding estimators. The finite sample performance is demonstrated with simulation studies and functional snippets arising from a growth study and spinal bone mineral density data.
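Forward reconstruction of sample paths from an SDE can be sketched with Euler-Maruyama simulation. The Ornstein-Uhlenbeck drift and diffusion below are fixed illustrative choices standing in for the data-adaptive, estimated dynamics the abstract describes; theta, mu, sigma, and the snippet endpoint x0 are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Ornstein-Uhlenbeck SDE: dX = theta*(mu - X) dt + sigma dW.
# In the paper the dynamics are estimated from data; here they are fixed.
theta, mu, sigma = 2.0, 1.0, 0.3
x0, t0, t1, n_steps, n_paths = 0.2, 0.0, 1.0, 200, 500
dt = (t1 - t0) / n_steps

# Euler-Maruyama forward simulation from the snippet endpoint x0
x = np.full(n_paths, x0)
for _ in range(n_steps):
    x = x + theta * (mu - x) * dt + sigma * np.sqrt(dt) * rng.normal(size=n_paths)

# The simulated mean at t1 can be checked against the exact OU mean
exact_mean = mu + (x0 - mu) * np.exp(-theta * (t1 - t0))
print(x.mean(), exact_mean)
```

Averaging many simulated paths recovers the conditional mean trajectory, which is the sense in which forward paths are "reconstructed" beyond the observed snippet window.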
ABSTRACT: Many trials are designed to collect outcomes at or around pre-specified times after randomization. If there is variability in the times when participants are actually assessed, this can pose a challenge to learning the effect of treatment, since not all participants have outcome assessments at the times of interest. Furthermore, observed outcome values may not be representative of all participants' outcomes at a given time. Methods have been developed that account for some types of such irregular and informative assessment times; however, since these methods rely on untestable assumptions, sensitivity analyses are needed. We develop a sensitivity analysis methodology that is benchmarked at the explainable assessment (EA) assumption, under which assessment and outcomes at each time are related only through data collected prior to that time. Our method uses an exponential tilting assumption, governed by a sensitivity analysis parameter, that posits deviations from the EA assumption. Our inferential strategy is based on a new influence function-based, augmented inverse intensity-weighted estimator. Our approach allows for flexible semiparametric modeling of the observed data, which is separated from specification of the sensitivity parameter. We apply our method to a randomized trial of low-income individuals with uncontrolled asthma, and we illustrate implementation of our estimation procedure in detail.
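The exponential tilting device can be illustrated in miniature: a sensitivity parameter alpha tilts the outcome distribution by exp(alpha * y), with alpha = 0 recovering the benchmark assumption. This sketch only shows how the tilted mean moves with alpha; it is not the paper's influence-function-based augmented estimator, and the normal outcome distribution is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical outcomes among assessed participants
y_assessed = rng.normal(loc=1.0, scale=1.0, size=10_000)

def tilted_mean(y, alpha):
    """Mean of y under the exp(alpha*y)-tilted distribution.

    alpha = 0 corresponds to no deviation from the benchmark
    (explainable assessment) assumption.
    """
    w = np.exp(alpha * y)
    return np.sum(w * y) / np.sum(w)

m0 = tilted_mean(y_assessed, 0.0)   # benchmark: ordinary sample mean
m5 = tilted_mean(y_assessed, 0.5)   # posited deviation shifts the mean up
print(m0, m5)
```

Sweeping alpha over a grid and reporting the resulting estimates is the usual way such a sensitivity parameter is presented.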
We propose a general framework of using a multi-level log-Gaussian Cox process to model repeatedly observed point processes with complex structures; such data have become increasingly available in various areas including medical research, social sciences, economics, and finance due to technological advances. A novel nonparametric approach is developed to efficiently and consistently estimate the covariance functions of the latent Gaussian processes at all levels. To predict the functional principal component scores, we propose a consistent estimation procedure by maximizing the conditional likelihood of super-positions of point processes. We further extend our procedure to the bivariate point process case in which potential correlations between the processes can be assessed. Asymptotic properties of the proposed estimators are investigated, and the effectiveness of our procedures is illustrated through a simulation study and an application to a stock trading dataset.
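A single-level log-Gaussian Cox process, the building block of the multi-level framework above, can be simulated in a few lines: the log-intensity is a draw from a latent Gaussian process, and points arrive as an inhomogeneous Poisson process. The squared-exponential covariance, its length-scale, and the baseline level 3.0 are illustrative assumptions, and the bin-wise Poisson counts are a grid approximation to the point process.

```python
import numpy as np

rng = np.random.default_rng(4)

# Latent GP for the log-intensity on a grid over [0, 1]
n_grid = 200
s = np.linspace(0, 1, n_grid)
K = np.exp(-0.5 * (s[:, None] - s[None, :]) ** 2 / 0.1**2)  # SE covariance
L = np.linalg.cholesky(K + 1e-6 * np.eye(n_grid))           # jitter for stability
log_lam = 3.0 + L @ rng.normal(size=n_grid)
lam = np.exp(log_lam)   # intensity is positive by construction

# Bin-wise Poisson counts approximate the point process on the grid
ds = s[1] - s[0]
counts = rng.poisson(lam * ds)
print(counts.sum())
```

Repeating this draw per subject (and adding subject- and replicate-level GP terms to the log-intensity) gives the kind of repeatedly observed point process data the multi-level model targets.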