Abstract Linear quantile regression is a powerful tool to investigate how predictors may affect a response heterogeneously across different quantile levels. Unfortunately, existing approaches find it extremely difficult to adjust for any dependency between observation units, largely because such methods are not based upon a fully generative model of the data. For analysing spatially indexed data, we address this difficulty by generalizing the joint quantile regression model of Yang and Tokdar (Journal of the American Statistical Association, 2017, 112(519), 1107–1120) and characterizing spatial dependence via a Gaussian or t-copula process on the underlying quantile levels of the observation units. A Bayesian semiparametric approach is introduced to perform inference of model parameters and carry out spatial quantile smoothing. An effective model comparison criteria is provided, particularly for selecting between different model specifications of tail heaviness and tail dependence. Extensive simulation studies and two real applications to particulate matter concentration and wildfire risk are presented to illustrate substantial gains in inference quality, prediction accuracy and uncertainty quantification over existing alternatives. 
                        more » 
                        « less   
                    
                            
                            Robust estimation and inference for expected shortfall regression with many regressors
                        
                    
    
            Abstract Expected shortfall (ES), also known as superquantile or conditional value-at-risk, is an important measure in risk analysis and stochastic optimisation and has applications beyond these fields. In finance, it refers to the conditional expected return of an asset given that the return is below some quantile of its distribution. In this paper, we consider a joint regression framework recently proposed to model the quantile and ES of a response variable simultaneously, given a set of covariates. The current state-of-the-art approach to this problem involves minimising a non-differentiable and non-convex joint loss function, which poses numerical challenges and limits its applicability to large-scale data. Motivated by the idea of using Neyman-orthogonal scores to reduce sensitivity to nuisance parameters, we propose a statistically robust and computationally efficient two-step procedure for fitting joint quantile and ES regression models that can handle highly skewed and heavy-tailed data. We establish explicit non-asymptotic bounds on estimation and Gaussian approximation errors that lay the foundation for statistical inference, even with increasing covariate dimensions. Finally, through numerical experiments and two data applications, we demonstrate that our approach well balances robustness, statistical, and numerical efficiencies for expected shortfall regression. 
        more » 
        « less   
        
    
    
                            - PAR ID:
- 10514831
- Publisher / Repository:
- Oxford University Press
- Date Published:
- Journal Name:
- Journal of the Royal Statistical Society Series B: Statistical Methodology
- Volume:
- 85
- Issue:
- 4
- ISSN:
- 1369-7412
- Format(s):
- Medium: X Size: p. 1223-1246
- Size(s):
- p. 1223-1246
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            Fan, Yanan; Nott, David; Smith, Michael S; Dortet-Bernadet, Jean-Luc. (Ed.)Quantile regression is widely seen as an ideal tool to understand complex predictor-response relations. Its biggest promise rests in its ability to quantify whether and how predictor effects vary across response quantile levels. But this promise has not been fully met due to a lack of statistical estimation methods that perform a rigorous, joint analysis of all quantile levels. This gap has been recently bridged by Yang and Tokdar [18]. Here we demonstrate how their joint quantile regression method, as encoded in the R package qrjoint, offers a comprehensive and model-based regression analysis framework. This chapter is an R vignette where we illustrate how to fit models, interpret coefficients, improve and compare models and obtain predictions under this framework. Our case study is an application to ecology where we analyse how the abundance of red maple trees depends on topographical and geographical features of the location. A complete absence of the species contributes excess zeros in the response data. We treat such excess zeros as left censoring in the spirit of a Tobit regression analysis. By utilising the generative nature of the joint quantile regression model, we not only adjust for censoring but also treat it as an object of independent scientific interest.more » « less
- 
            Growth curve models have been widely used to analyse longitudinal data in social and behavioural sciences. Although growth curve models with normality assumptions are relatively easy to estimate, practical data are rarely normal. Failing to account for non‐normal data may lead to unreliable model estimation and misleading statistical inference. In this work, we propose a robust approach for growth curve modelling using conditional medians that are less sensitive to outlying observations. Bayesian methods are applied for model estimation and inference. Based on the existing work on Bayesian quantile regression using asymmetric Laplace distributions, we use asymmetric Laplace distributions to convert the problem of estimating a median growth curve model into a problem of obtaining the maximum likelihood estimator for a transformed model. Monte Carlo simulation studies have been conducted to evaluate the numerical performance of the proposed approach with data containing outliers or leverage observations. The results show that the proposed approach yields more accurate and efficient parameter estimates than traditional growth curve modelling. We illustrate the application of our robust approach using conditional medians based on a real data set from the Virginia Cognitive Aging Project.more » « less
- 
            The conditional average treatment effect (CATE) is the best measure of individual causal effects given baseline covariates. However, the CATE only captures the (conditional) average, and can overlook risks and tail events, which are important to treatment choice. In aggregate analyses, this is usually addressed by measuring the distributional treatment effect (DTE), such as differences in quantiles or tail expectations between treatment groups. Hypothetically, one can similarly fit conditional quantile regressions in each treatment group and take their difference, but this would not be robust to misspecification or provide agnostic best-in-class predictions. We provide a new robust and model-agnostic methodology for learning the conditional DTE (CDTE) for a class of problems that includes conditional quantile treatment effects, conditional super-quantile treatment effects, and conditional treatment effects on coherent risk measures given by f-divergences. Our method is based on constructing a special pseudo-outcome and regressing it on covariates using any regression learner. Our method is model-agnostic in that it can provide the best projection of CDTE onto the regression model class. Our method is robust in that even if we learn these nuisances nonparametrically at very slow rates, we can still learn CDTEs at rates that depend on the class complexity and even conduct inferences on linear projections of CDTEs. We investigate the behavior of our proposal in simulations, as well as in a case study of 401(k) eligibility effects on wealth.more » « less
- 
            Summary Quantile regression has become a widely used tool for analysing competing risk data. However, quantile regression for competing risk data with a continuous mark is still scarce. The mark variable is an extension of cause of failure in a classical competing risk model where cause of failure is replaced by a continuous mark only observed at uncensored failure times. An example of the continuous mark variable is the genetic distance that measures dissimilarity between the infecting virus and the virus contained in the vaccine construct. In this article, we propose a novel mark-specific quantile regression model. The proposed estimation method borrows strength from data in a neighbourhood of a mark and is based on an induced smoothed estimation equation, which is very different from the existing methods for competing risk data with discrete causes. The asymptotic properties of the resulting estimators are established across mark and quantile continuums. In addition, a mark-specific quantile-type vaccine efficacy is proposed and its statistical inference procedures are developed. Simulation studies are conducted to evaluate the finite sample performances of the proposed estimation and hypothesis testing procedures. An application to the first HIV vaccine efficacy trial is provided.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
