

Title: Penalized Triograms: Total Variation Regularization for Bivariate Smoothing
Summary

Hansen, Kooperberg and Sardy introduced a family of continuous, piecewise linear functions defined over adaptively selected triangulations of the plane as a general approach to statistical modelling of bivariate densities and regression and hazard functions. These triograms enjoy a natural affine equivariance that offers distinct advantages over competing tensor product methods that are more commonly used in statistical applications. Triograms employ basis functions consisting of linear ‘tent functions’ defined with respect to a triangulation of a given planar domain. As in knot selection for univariate splines, Hansen and colleagues adopted the regression spline approach of Stone. Vertices of the triangulation are introduced or removed sequentially in an effort to balance fidelity to the data and parsimony. We explore a smoothing spline variant of the triogram model based on a roughness penalty adapted to the piecewise linear structure of the triogram model. We show that the roughness penalty proposed may be interpreted as a total variation penalty on the gradient of the fitted function. The methods are illustrated with real and artificial examples, including an application to estimated quantile surfaces of land value in the Chicago metropolitan area.
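For a continuous piecewise linear function on a triangulation, the gradient is constant on each triangle, so a total variation penalty on the gradient reduces to a sum over interior edges of the edge length times the jump in gradient across that edge. The sketch below is an illustration of this penalty only (not the authors' fitting procedure), assuming the triangulation is supplied as vertex coordinates and index triples:

```python
import numpy as np

def tri_gradient(pts, z):
    """Gradient of the linear interpolant on one triangle.

    pts : (3, 2) array of vertex coordinates
    z   : (3,) array of function values at the vertices
    """
    # Solve z_i = a + g . p_i for the constant gradient g on the triangle.
    A = np.hstack([np.ones((3, 1)), pts])
    coef = np.linalg.solve(A, z)
    return coef[1:]                      # (g_x, g_y)

def tv_penalty(vertices, triangles, z):
    """Total variation of the gradient of a piecewise linear fit:
    sum over interior edges of |edge length| * ||jump in gradient||."""
    grads = {tuple(sorted(t)): tri_gradient(vertices[list(t)], z[list(t)])
             for t in triangles}
    # Map each edge to the triangles sharing it.
    edge_tris = {}
    for t in triangles:
        for e in [(t[0], t[1]), (t[1], t[2]), (t[0], t[2])]:
            edge_tris.setdefault(tuple(sorted(e)), []).append(tuple(sorted(t)))
    pen = 0.0
    for e, ts in edge_tris.items():
        if len(ts) == 2:                 # interior edge
            length = np.linalg.norm(vertices[e[0]] - vertices[e[1]])
            jump = grads[ts[0]] - grads[ts[1]]
            pen += length * np.linalg.norm(jump)
    return pen
```

A globally affine function has zero penalty (no gradient jumps), which is consistent with the affine equivariance emphasized in the abstract.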

 
NSF-PAR ID: 10404444
Author(s) / Creator(s):
Publisher / Repository: Oxford University Press
Date Published:
Journal Name: Journal of the Royal Statistical Society Series B: Statistical Methodology
Volume: 66
Issue: 1
ISSN: 1369-7412
Format(s): Medium: X; Size: p. 145-163
Sponsoring Org: National Science Foundation
More Like this
  1.  

    A new family of penalty functions, i.e. penalties adaptive to the likelihood, is introduced for model selection in general regression models. It arises naturally through assuming certain types of prior distribution on the regression parameters. To study the stability properties of the penalized maximum-likelihood estimator, two types of asymptotic stability are defined. Theoretical properties, including parameter estimation consistency, model selection consistency, and asymptotic stability, are established under suitable regularity conditions. An efficient coordinate-descent algorithm is proposed. Simulation results and real data analysis show that the proposed approach has competitive performance in comparison with existing methods.
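The coordinate-descent strategy mentioned above can be sketched in its simplest penalized-regression form. The block below implements cyclic coordinate descent with soft-thresholding for a plain lasso penalty; it is an illustration of the algorithmic idea, not the paper's adaptive-penalty method:

```python
import numpy as np

def soft_threshold(x, lam):
    """Soft-thresholding operator: the closed-form coordinate-wise update."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def coordinate_descent(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for the lasso problem
    (1/2n) ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    r = y - X @ b                      # current residual
    col_sq = (X ** 2).sum(axis=0) / n  # per-column curvature
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]        # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]        # add the updated coordinate back
    return b
```

Each coordinate update is an exact one-dimensional minimization, which is what makes the scheme efficient for penalties that are separable across coefficients.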

     
  2.  
    Spline functions are useful for interpolating data with a certain regularity. They are defined as piecewise polynomials that satisfy regularity conditions at the joints. Several references in the spline literature study the appearance of the Gibbs phenomenon close to jump discontinuities in results obtained by spline interpolation. This work is devoted to the construction and analysis of a new nonlinear technique that improves the accuracy of splines near jump discontinuities by eliminating the Gibbs phenomenon. The adaptation is easily attained through a nonlinear modification of the right-hand side of the spline's system of equations, which contains divided differences. The modification is based on a new limiter specifically designed to attain adaptation close to jumps in the function. The new limiter can be seen as a nonlinear weighted mean with better adaptation properties than the linear weighted mean. We prove that the nonlinear modification introduced in the spline keeps the maximum theoretical accuracy throughout the domain except in the intervals that contain a jump discontinuity, where Gibbs oscillations are eliminated. Some diffusion is introduced, but this is acceptable if the discontinuity arises from an insufficiently accurate discretization of a high gradient. The new technique is introduced for cubic splines, but the theory presented generalizes easily to splines of any order. The experiments presented confirm the theoretical results of the paper.
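The Gibbs phenomenon this abstract refers to is easy to reproduce: interpolating a step function with a standard cubic spline produces overshoot and oscillations near the jump. The sketch below uses SciPy's `CubicSpline` to exhibit the effect; the paper's nonlinear limiter itself is not implemented here:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Sample a step function (jump at x = 0) on a uniform grid.
x = np.linspace(-1, 1, 21)
y = np.where(x < 0, 0.0, 1.0)

spline = CubicSpline(x, y)

# On a fine grid the spline leaves the data range [0, 1] near the jump:
# this overshoot/undershoot is the spline analogue of the Gibbs phenomenon.
xx = np.linspace(-1, 1, 2001)
vals = spline(xx)
overshoot = vals.max() - 1.0    # > 0: oscillation above the upper state
undershoot = -vals.min()        # > 0: oscillation below the lower state
```

The oscillations decay away from the jump but never vanish for a linear scheme, which is why the abstract's nonlinear, data-adapted modification of the spline system is needed.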
  3. Abstract

    Linear quantile regression is a powerful tool to investigate how predictors may affect a response heterogeneously across different quantile levels. Unfortunately, existing approaches find it extremely difficult to adjust for any dependency between observation units, largely because such methods are not based upon a fully generative model of the data. For analysing spatially indexed data, we address this difficulty by generalizing the joint quantile regression model of Yang and Tokdar (Journal of the American Statistical Association, 2017, 112(519), 1107–1120) and characterizing spatial dependence via a Gaussian or t-copula process on the underlying quantile levels of the observation units. A Bayesian semiparametric approach is introduced to perform inference of model parameters and carry out spatial quantile smoothing. An effective model comparison criterion is provided, particularly for selecting between different model specifications of tail heaviness and tail dependence. Extensive simulation studies and two real applications to particulate matter concentration and wildfire risk are presented to illustrate substantial gains in inference quality, prediction accuracy and uncertainty quantification over existing alternatives.

     
  4. Summary

    We introduce a flexible marginal modelling approach for statistical inference for clustered and longitudinal data under minimal assumptions. This estimated estimating equations approach is semiparametric and the proposed models are fitted by quasi-likelihood regression, where the unknown marginal means are a function of the fixed effects linear predictor with unknown smooth link, and variance–covariance is an unknown smooth function of the marginal means. We propose to estimate the nonparametric link and variance–covariance functions via smoothing methods, whereas the regression parameters are obtained via the estimated estimating equations. These are score equations that contain nonparametric function estimates. The proposed estimated estimating equations approach is motivated by its flexibility and easy implementation. Moreover, if data follow a generalized linear mixed model, with either a specified or an unspecified distribution of random effects and link function, the model proposed emerges as the corresponding marginal (population-average) version and can be used to obtain inference for the fixed effects in the underlying generalized linear mixed model, without the need to specify any other components of this generalized linear mixed model. Among marginal models, the estimated estimating equations approach provides a flexible alternative to modelling with generalized estimating equations. Applications of estimated estimating equations include diagnostics and link selection. The asymptotic distribution of the proposed estimators for the model parameters is derived, enabling statistical inference. Practical illustrations include Poisson modelling of repeated epileptic seizure counts and simulations for clustered binomial responses.

     
  5. Abstract

    ℓ1-penalized quantile regression (QR) is widely used for analysing high-dimensional data with heterogeneity. It is now recognized that the ℓ1-penalty introduces non-negligible estimation bias, while a proper use of concave regularization may lead to estimators with refined convergence rates and oracle properties as the signal strengthens. Although folded concave penalized M-estimation with strongly convex loss functions has been well studied, the extant literature on QR is relatively silent. The main difficulty is that the quantile loss is piecewise linear: it is non-smooth and has curvature concentrated at a single point. To overcome the lack of smoothness and strong convexity, we propose and study a convolution-type smoothed QR with iteratively reweighted ℓ1-regularization. The resulting smoothed empirical loss is twice continuously differentiable and (provably) locally strongly convex with high probability. We show that the iteratively reweighted ℓ1-penalized smoothed QR estimator, after a few iterations, achieves the optimal rate of convergence, and moreover, the oracle rate and the strong oracle property under an almost necessary and sufficient minimum signal strength condition. Extensive numerical studies corroborate our theoretical results.
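Convolution-type smoothing replaces the piecewise linear check loss with its convolution against a kernel, which yields a smooth, locally strongly convex surrogate. For a Gaussian kernel (one common choice, used here purely for illustration) the smoothed loss has a closed form:

```python
from math import erf, exp, pi, sqrt

def check_loss(u, tau):
    """Standard quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def smoothed_check_loss(u, tau, h):
    """Check loss convolved with a Gaussian kernel of bandwidth h.
    Closed form: tau*u + h*phi(u/h) - u*(1 - Phi(u/h)),
    where phi and Phi are the standard normal density and CDF."""
    z = u / h
    phi = exp(-0.5 * z * z) / sqrt(2 * pi)
    Phi = 0.5 * (1 + erf(z / sqrt(2)))
    return tau * u + h * phi - u * (1 - Phi)
```

Its derivative is tau - 1 + Phi(u/h), which interpolates smoothly between the two subgradient values tau - 1 and tau of the check loss; as h decreases to 0 the surrogate converges to the original loss.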

     