skip to main content

Title: Fast Bivariate P -Splines: The Sandwich Smoother

We propose a fast penalized spline method for bivariate smoothing. Univariate P-spline smoothers are applied simultaneously along both co-ordinates. The new smoother has a sandwich form which suggested the name ‘sandwich smoother’ to a referee. The sandwich smoother has a tensor product structure that simplifies an asymptotic analysis and it can be fast computed. We derive a local central limit theorem for the sandwich smoother, with simple expressions for the asymptotic bias and variance, by showing that the sandwich smoother is asymptotically equivalent to a bivariate kernel regression estimator with a product kernel. As far as we are aware, this is the first central limit theorem for a bivariate spline estimator of any type. Our simulation study shows that the sandwich smoother is orders of magnitude faster to compute than other bivariate spline smoothers, even when the latter are computed by using a fast generalized linear array model algorithm, and comparable with them in terms of mean integrated squared errors. We extend the sandwich smoother to array data of higher dimensions, where a generalized linear array model algorithm improves the computational speed of the sandwich smoother. One important application of the sandwich smoother is to estimate covariance functions in functional data analysis. In this application, our numerical results show that the sandwich smoother is orders of magnitude faster than local linear regression. The speed of the sandwich formula is important because functional data sets are becoming quite large.

more » « less
Author(s) / Creator(s):
; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Journal of the Royal Statistical Society Series B: Statistical Methodology
Page Range / eLocation ID:
p. 577-599
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary

    We extend the notion of an influence or hat matrix to regression with functional responses and scalar predictors. For responses depending linearly on a set of predictors, our definition is shown to reduce to the conventional influence matrix for linear models. The pointwise degrees of freedom, the trace of the pointwise influence matrix, are shown to have an adaptivity property that motivates a two-step bivariate smoother for modeling nonlinear dependence on a single predictor. This procedure adapts to varying complexity of the nonlinear model at different locations along the function, and thereby achieves better performance than competing tensor product smoothers in an analysis of the development of white matter microstructure in the brain.

    more » « less
  2. Summary

    Motivated by the increasing use of kernel-based metrics for high-dimensional and large-scale data, we study the asymptotic behaviour of kernel two-sample tests when the dimension and sample sizes both diverge to infinity. We focus on the maximum mean discrepancy using an isotropic kernel, which includes maximum mean discrepancy with the Gaussian kernel and the Laplace kernel, and the energy distance as special cases. We derive asymptotic expansions of the kernel two-sample statistics, based on which we establish a central limit theorem under both the null hypothesis and the local and fixed alternatives. The new nonnull central limit theorem results allow us to perform asymptotic exact power analysis, which reveals a delicate interplay between the moment discrepancy that can be detected by the kernel two-sample tests and the dimension-and-sample orders. The asymptotic theory is further corroborated through numerical studies.

    more » « less
  3. Summary

    Hansen, Kooperberg and Sardy introduced a family of continuous, piecewise linear functions defined over adaptively selected triangulations of the plane as a general approach to statistical modelling of bivariate densities and regression and hazard functions. These triograms enjoy a natural affine equivariance that offers distinct advantages over competing tensor product methods that are more commonly used in statistical applications. Triograms employ basis functions consisting of linear ‘tent functions’ defined with respect to a triangulation of a given planar domain. As in knot selection for univariate splines, Hansen and colleagues adopted the regression spline approach of Stone. Vertices of the triangulation are introduced or removed sequentially in an effort to balance fidelity to the data and parsimony. We explore a smoothing spline variant of the triogram model based on a roughness penalty adapted to the piecewise linear structure of the triogram model. We show that the roughness penalty proposed may be interpreted as a total variation penalty on the gradient of the fitted function. The methods are illustrated with real and artificial examples, including an application to estimated quantile surfaces of land value in the Chicago metropolitan area.

    more » « less
  4. The statistical practice of modeling interaction with two linear main effects and a product term is ubiquitous in the statistical and epidemiological literature. Most data modelers are aware that the misspecification of main effects can potentially cause severe type I error inflation in tests for interactions, leading to spurious detection of interactions. However, modeling practice has not changed. In this article, we focus on the specific situation where the main effects in the model are misspecified as linear terms and characterize its impact on common tests for statistical interaction. We then propose some simple alternatives that fix the issue of potential type I error inflation in testing interaction due to main effect misspecification. We show that when using the sandwich variance estimator for a linear regression model with a quantitative outcome and two independent factors, both the Wald and score tests asymptotically maintain the correct type I error rate. However, if the independence assumption does not hold or the outcome is binary, using the sandwich estimator does not fix the problem. We further demonstrate that flexibly modeling the main effect under a generalized additive model can largely reduce or often remove bias in the estimates and maintain the correct type I error rate for both quantitative and binary outcomes regardless of the independence assumption. We show, under the independence assumption and for a continuous outcome, overfitting and flexibly modeling the main effects does not lead to power loss asymptotically relative to a correctly specified main effect model. Our simulation study further demonstrates the empirical fact that using flexible models for the main effects does not result in a significant loss of power for testing interaction in general. Our results provide an improved understanding of the strengths and limitations for tests of interaction in the presence of main effect misspecification. Using data from a large biobank study “The Michigan Genomics Initiative”, we present two examples of interaction analysis in support of our results.

    more » « less
  5. Abstract

    All-atom dynamics simulations are an indispensable quantitative tool in physics, chemistry, and materials science, but large systems and long simulation times remain challenging due to the trade-off between computational efficiency and predictive accuracy. To address this challenge, we combine effective two- and three-body potentials in a cubic B-spline basis with regularized linear regression to obtain machine-learning potentials that are physically interpretable, sufficiently accurate for applications, as fast as the fastest traditional empirical potentials, and two to four orders of magnitude faster than state-of-the-art machine-learning potentials. For data from empirical potentials, we demonstrate the exact retrieval of the potential. For data from density functional theory, the predicted energies, forces, and derived properties, including phonon spectra, elastic constants, and melting points, closely match those of the reference method. The introduced potentials might contribute towards accurate all-atom dynamics simulations of large atomistic systems over long-time scales.

    more » « less