Title: Robust estimation and inference for expected shortfall regression with many regressors
Abstract: Expected shortfall (ES), also known as superquantile or conditional value-at-risk, is an important measure in risk analysis and stochastic optimisation and has applications beyond these fields. In finance, it refers to the conditional expected return of an asset given that the return is below some quantile of its distribution. In this paper, we consider a joint regression framework recently proposed to model the quantile and ES of a response variable simultaneously, given a set of covariates. The current state-of-the-art approach to this problem involves minimising a non-differentiable and non-convex joint loss function, which poses numerical challenges and limits its applicability to large-scale data. Motivated by the idea of using Neyman-orthogonal scores to reduce sensitivity to nuisance parameters, we propose a statistically robust and computationally efficient two-step procedure for fitting joint quantile and ES regression models that can handle highly skewed and heavy-tailed data. We establish explicit non-asymptotic bounds on estimation and Gaussian approximation errors that lay the foundation for statistical inference, even with increasing covariate dimensions. Finally, through numerical experiments and two data applications, we demonstrate that our approach strikes a good balance between robustness, statistical efficiency, and numerical efficiency for expected shortfall regression.
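As a minimal illustration of the definition in the abstract (the expected return given that the return falls at or below a quantile), the sketch below computes an empirical lower-tail ES in two ways: directly, as the average of the worst returns, and through the quantile plus a scaled average of the shortfalls below it. It is not the paper's two-step estimator and involves no covariates. The function names and the sample data are hypothetical, and the quantile convention (the ceil(alpha * n)-th order statistic) is an assumption chosen to keep the arithmetic simple.

```python
import math


def empirical_quantile(returns, alpha):
    """Assumed convention: the ceil(alpha * n)-th smallest return."""
    ordered = sorted(returns)
    k = max(1, math.ceil(alpha * len(ordered)))
    return ordered[k - 1]


def empirical_es(returns, alpha):
    """Average of the ceil(alpha * n) smallest returns (lower-tail ES)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(returns)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def es_via_quantile(returns, alpha):
    """ES written as q + mean((y - q) * 1{y <= q}) / alpha.

    min(y - q, 0) equals (y - q) when y <= q and 0 otherwise.
    """
    q = empirical_quantile(returns, alpha)
    shortfall = sum(min(y - q, 0.0) for y in returns) / len(returns)
    return q + shortfall / alpha


# Hypothetical sample of ten returns; alpha = 0.2 keeps the two worst.
sample = [-5, -3, -1, 0, 1, 2, 3, 4, 5, 6]
print(empirical_quantile(sample, 0.2))
print(empirical_es(sample, 0.2))
print(es_via_quantile(sample, 0.2))
```

The two ES computations agree on this sample because alpha * n is an integer and there are no ties at the quantile; in general they can differ slightly depending on the quantile convention used.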
Award ID(s):
2238428 2113346 2401268 2113409
PAR ID:
10505209
Author(s) / Creator(s):
; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Journal of the Royal Statistical Society Series B: Statistical Methodology
Volume:
85
Issue:
4
ISSN:
1369-7412
Page Range / eLocation ID:
1223 to 1246
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Fan, Yanan; Nott, David; Smith, Michael S; Dortet-Bernadet, Jean-Luc. (Ed.)
    Quantile regression is widely seen as an ideal tool to understand complex predictor-response relations. Its biggest promise rests in its ability to quantify whether and how predictor effects vary across response quantile levels. But this promise has not been fully met due to a lack of statistical estimation methods that perform a rigorous, joint analysis of all quantile levels. This gap has been recently bridged by Yang and Tokdar [18]. Here we demonstrate how their joint quantile regression method, as encoded in the R package qrjoint, offers a comprehensive and model-based regression analysis framework. This chapter is an R vignette where we illustrate how to fit models, interpret coefficients, improve and compare models and obtain predictions under this framework. Our case study is an application to ecology where we analyse how the abundance of red maple trees depends on topographical and geographical features of the location. A complete absence of the species contributes excess zeros in the response data. We treat such excess zeros as left censoring in the spirit of a Tobit regression analysis. By utilising the generative nature of the joint quantile regression model, we not only adjust for censoring but also treat it as an object of independent scientific interest. 
  2. Quantile regression has become a widely used tool for analysing competing risk data. However, quantile regression for competing risk data with a continuous mark is still scarce. The mark variable is an extension of cause of failure in a classical competing risk model where cause of failure is replaced by a continuous mark only observed at uncensored failure times. An example of the continuous mark variable is the genetic distance that measures dissimilarity between the infecting virus and the virus contained in the vaccine construct. In this article, we propose a novel mark-specific quantile regression model. The proposed estimation method borrows strength from data in a neighbourhood of a mark and is based on an induced smoothed estimation equation, which is very different from the existing methods for competing risk data with discrete causes. The asymptotic properties of the resulting estimators are established across mark and quantile continuums. In addition, a mark-specific quantile-type vaccine efficacy is proposed and its statistical inference procedures are developed. Simulation studies are conducted to evaluate the finite sample performances of the proposed estimation and hypothesis testing procedures. An application to the first HIV vaccine efficacy trial is provided. 
  3. The conditional average treatment effect (CATE) is the best measure of individual causal effects given baseline covariates. However, the CATE only captures the (conditional) average, and can overlook risks and tail events, which are important to treatment choice. In aggregate analyses, this is usually addressed by measuring the distributional treatment effect (DTE), such as differences in quantiles or tail expectations between treatment groups. Hypothetically, one can similarly fit conditional quantile regressions in each treatment group and take their difference, but this would not be robust to misspecification or provide agnostic best-in-class predictions. We provide a new robust and model-agnostic methodology for learning the conditional DTE (CDTE) for a class of problems that includes conditional quantile treatment effects, conditional super-quantile treatment effects, and conditional treatment effects on coherent risk measures given by f-divergences. Our method is based on constructing a special pseudo-outcome and regressing it on covariates using any regression learner. Our method is model-agnostic in that it can provide the best projection of CDTE onto the regression model class. Our method is robust in that even if we learn these nuisances nonparametrically at very slow rates, we can still learn CDTEs at rates that depend on the class complexity and even conduct inferences on linear projections of CDTEs. We investigate the behavior of our proposal in simulations, as well as in a case study of 401(k) eligibility effects on wealth. 
  4. Chiappa, Silvia; Calandra, Roberto (Ed.)
    Random forests are a powerful non-parametric regression method but are severely limited in their usage in the presence of randomly censored observations, and when naively applied can exhibit poor predictive performance due to the incurred biases. Based on a local adaptive representation of random forests, we develop its regression adjustment for randomly censored regression quantile models. Regression adjustment is based on a new estimating equation that adapts to censoring and leads to the quantile score whenever the data do not exhibit censoring. The proposed procedure, named censored quantile regression forest, allows us to estimate quantiles of time-to-event without any parametric modeling assumption. We establish its consistency under mild model specifications. Numerical studies showcase a clear advantage of the proposed procedure.
  5. Quantile regression is a popular and powerful method for studying the effect of regressors on quantiles of a response distribution. However, existing results on quantile regression were mainly developed for cases in which the quantile level is fixed, and the data are often assumed to be independent. Motivated by recent applications, we consider the situation where (i) the quantile level is not fixed and can grow with the sample size to capture the tail phenomena, and (ii) the data are no longer independent, but collected as a time series that can exhibit serial dependence in both tail and non-tail regions. To study the asymptotic theory for high-quantile regression estimators in the time series setting, we introduce a tail adversarial stability condition, which had not previously been described, and show that it leads to an interpretable and convenient framework for obtaining limit theorems for time series that exhibit serial dependence in the tail region, but are not necessarily strongly mixing. Numerical experiments are conducted to illustrate the effect of tail dependence on high-quantile regression estimators, for which simply ignoring the tail dependence may yield misleading $p$-values.