Title: High-quantile regression for tail-dependent time series
Summary Quantile regression is a popular and powerful method for studying the effect of regressors on quantiles of a response distribution. However, existing results on quantile regression were mainly developed for cases in which the quantile level is fixed, and the data are often assumed to be independent. Motivated by recent applications, we consider the situation where (i) the quantile level is not fixed and can grow with the sample size to capture the tail phenomena, and (ii) the data are no longer independent, but collected as a time series that can exhibit serial dependence in both tail and non-tail regions. To study the asymptotic theory for high-quantile regression estimators in the time series setting, we introduce a tail adversarial stability condition, which had not previously been described, and show that it leads to an interpretable and convenient framework for obtaining limit theorems for time series that exhibit serial dependence in the tail region, but are not necessarily strongly mixing. Numerical experiments are conducted to illustrate the effect of tail dependence on high-quantile regression estimators, for which simply ignoring the tail dependence may yield misleading $p$-values.
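The abstract does not spell out the estimator, so the following is only a minimal sketch of the standard check-loss (pinball-loss) formulation of quantile regression, reduced to the intercept-only case. The growing quantile level `tau_n = 1 - n**-0.5`, the Pareto sample, and the function names are all illustrative assumptions that do not come from the paper. The simulated data are independent, so the sketch does not reproduce the tail-dependent setting the paper studies.

```python
import random


def check_loss(u, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def intercept_only_quantile_fit(y, tau):
    """Minimise sum_i rho_tau(y_i - b) over b.

    With no regressors, a minimiser always lies at one of the data points,
    so an exhaustive search over the sample is enough for a small example.
    """
    return min(sorted(y), key=lambda b: sum(check_loss(yi - b, tau) for yi in y))


random.seed(0)
n = 2000
# Heavy-tailed i.i.d. sample (assumption: Pareto with shape 3, no serial dependence).
y = [random.paretovariate(3.0) for _ in range(n)]

# Hypothetical quantile level that approaches 1 as the sample size grows.
tau_n = 1.0 - n ** -0.5

q_hat = intercept_only_quantile_fit(y, tau_n)
frac_below = sum(yi <= q_hat for yi in y) / n
print(f"tau_n = {tau_n:.4f}, fitted high quantile = {q_hat:.3f}, "
      f"fraction of sample at or below it = {frac_below:.4f}")
```

With regressors, the same loss is minimised over a coefficient vector, which is usually posed as a linear program. The exhaustive search here only works because the model has a single intercept.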
Award ID(s):
1848035 2131821
PAR ID:
10219887
Author(s) / Creator(s):
Date Published:
Journal Name:
Biometrika
Volume:
108
Issue:
1
ISSN:
0006-3444
Page Range / eLocation ID:
113 to 126
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary In this article we develop an asymptotic theory for sample tail autocorrelations of time series data that can exhibit serial dependence in both tail and non-tail regions. Unlike with the traditional autocorrelation function, the study of tail autocorrelations requires a double asymptotic scheme to capture the tail phenomena, and our results do not impose any restrictions on the dependence structure in non-tail regions and allow processes that are not necessarily strongly mixing. The newly developed asymptotic theory reveals a previously undiscovered phase transition phenomenon, where the asymptotic behaviour of sample tail autocorrelations, including their convergence rate, can transition from one phase to another as the lag index moves past the point beyond which serial tail dependence vanishes. The phase transition discovery fills a gap in existing research on tail autocorrelations and can be used to construct the lines of significance, in analogy to the traditional autocorrelation plot, when visualizing sample tail autocorrelations to assess the existence of serial tail dependence or to identify the maximal lag of tail dependence. 
  2. For stationary time series with regularly varying marginal distributions, an important problem is to estimate the associated tail index which characterizes the power‐law behavior of the tail distribution. For this, various results have been developed for independent data and certain types of dependent data. In this article, we consider the problem of tail index estimation under a recently proposed notion of serial tail dependence called the tail adversarial stability. Using the technique of adversarial innovation coupling and a martingale approximation scheme, we establish the consistency and central limit theorem of the tail index estimator for a general class of tail dependent time series. Based on the asymptotic normal distribution from the obtained central limit theorem, we further consider an application to cluster a large number of regularly varying time series based on their tail indices by using a robust mixture algorithm. The results are illustrated using numerical examples including Monte Carlo simulations and a real data analysis. 
  3. Abstract Linear quantile regression is a powerful tool to investigate how predictors may affect a response heterogeneously across different quantile levels. Unfortunately, existing approaches find it extremely difficult to adjust for any dependency between observation units, largely because such methods are not based upon a fully generative model of the data. For analysing spatially indexed data, we address this difficulty by generalizing the joint quantile regression model of Yang and Tokdar (Journal of the American Statistical Association, 2017, 112(519), 1107–1120) and characterizing spatial dependence via a Gaussian or t-copula process on the underlying quantile levels of the observation units. A Bayesian semiparametric approach is introduced to perform inference of model parameters and carry out spatial quantile smoothing. An effective model comparison criterion is provided, particularly for selecting between different model specifications of tail heaviness and tail dependence. Extensive simulation studies and two real applications to particulate matter concentration and wildfire risk are presented to illustrate substantial gains in inference quality, prediction accuracy and uncertainty quantification over existing alternatives.
  4. Steel, Mark (Ed.)
    Bayesian cross-validation (CV) is a popular method for predictive model assessment that is simple to implement and broadly applicable. A wide range of CV schemes is available for time series applications, including generic leave-one-out (LOO) and K-fold methods, as well as specialized approaches intended to deal with serial dependence such as leave-future-out (LFO), h-block, and hv-block. Existing large-sample results show that both specialized and generic methods are applicable to models of serially-dependent data. However, large sample consistency results overlook the impact of sampling variability on accuracy in finite samples. Moreover, the accuracy of a CV scheme depends on many aspects of the procedure. We show that poor design choices can lead to elevated rates of adverse selection. In this paper, we consider the problem of identifying the regression component of an important class of models of data with serial dependence, autoregressions of order p with q exogenous regressors (ARX(p,q)), under the logarithmic scoring rule. We show that when serial dependence is present, scores computed using the joint (multivariate) density have lower variance and better model selection accuracy than the popular pointwise estimator. In addition, we present a detailed case study of the special case of ARX models with fixed autoregressive structure and variance. For this class, we derive the finite-sample distribution of the CV estimators and the model selection statistic. We conclude with recommendations for practitioners. 
  5. RNA sequencing data have been abundantly generated in biomedical research for biomarker discovery and other studies. Such data at the exon level are usually heavy-tailed and correlated. Conventional statistical tests based on the mean or median difference for differential expression likely suffer from low power when the between-group difference occurs mostly in the upper or lower tail of the distribution of gene expression. We propose a tail-based test to make comparisons between groups in terms of a specific distribution area rather than a single location. The proposed test, which is derived from quantile regression, adjusts for covariates and accounts for within-sample dependence among the exons through a specified correlation structure. Through Monte Carlo simulation studies, we show that the proposed test is generally more powerful and robust in detecting differential expression than commonly used tests based on the mean or a single quantile. An application to TCGA lung adenocarcinoma data demonstrates the promise of the proposed method in terms of biomarker discovery.