
Title: Interaction analysis under misspecification of main effects: Some common mistakes and simple solutions

The statistical practice of modeling interaction with two linear main effects and a product term is ubiquitous in the statistical and epidemiological literature. Most data modelers are aware that the misspecification of main effects can potentially cause severe type I error inflation in tests for interactions, leading to spurious detection of interactions. However, modeling practice has not changed. In this article, we focus on the specific situation where the main effects in the model are misspecified as linear terms and characterize its impact on common tests for statistical interaction. We then propose some simple alternatives that fix the issue of potential type I error inflation in testing interaction due to main effect misspecification. We show that when using the sandwich variance estimator for a linear regression model with a quantitative outcome and two independent factors, both the Wald and score tests asymptotically maintain the correct type I error rate. However, if the independence assumption does not hold or the outcome is binary, using the sandwich estimator does not fix the problem. We further demonstrate that flexibly modeling the main effect under a generalized additive model can largely reduce or often remove bias in the estimates and maintain the correct type I error rate for both quantitative and binary outcomes regardless of the independence assumption. We show, under the independence assumption and for a continuous outcome, overfitting and flexibly modeling the main effects does not lead to power loss asymptotically relative to a correctly specified main effect model. Our simulation study further demonstrates the empirical fact that using flexible models for the main effects does not result in a significant loss of power for testing interaction in general. Our results provide an improved understanding of the strengths and limitations for tests of interaction in the presence of main effect misspecification. 
Using data from a large biobank study “The Michigan Genomics Initiative”, we present two examples of interaction analysis in support of our results.
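
The contrast between model-based and sandwich variance estimates described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: the data-generating model (a quadratic main effect in `x1`, no true interaction, independent standard normal factors), the sample sizes, and the function names `interaction_wald_z` and `rejection_rates` are all assumptions made for illustration. It fits the usual working model with two linear main effects and a product term, then compares the Wald test for the product term under the model-based variance and under an HC0 sandwich variance.

```python
import numpy as np


def interaction_wald_z(x1, x2, y):
    """Fit y ~ 1 + x1 + x2 + x1*x2 by OLS and return two Wald z-statistics
    for the product term: (model-based variance, HC0 sandwich variance)."""
    X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    n, p = X.shape

    # Model-based variance: assumes the working model is correctly specified.
    sigma2 = resid @ resid / (n - p)
    var_model = sigma2 * XtX_inv

    # HC0 sandwich variance: bread @ meat @ bread, with squared residuals
    # weighting each observation's contribution to the meat.
    meat = X.T @ (X * resid[:, None] ** 2)
    var_sandwich = XtX_inv @ meat @ XtX_inv

    z_model = beta[3] / np.sqrt(var_model[3, 3])
    z_sandwich = beta[3] / np.sqrt(var_sandwich[3, 3])
    return z_model, z_sandwich


def rejection_rates(n_rep=400, n=500, seed=0, crit=1.96):
    """Empirical type I error of both tests under an assumed null scenario:
    independent factors, nonlinear main effect of x1, no interaction."""
    rng = np.random.default_rng(seed)
    reject = np.zeros(2)
    for _ in range(n_rep):
        x1 = rng.standard_normal(n)
        x2 = rng.standard_normal(n)
        # True main effect is quadratic, so the linear term is misspecified.
        y = x1 ** 2 + rng.standard_normal(n)
        z_model, z_sandwich = interaction_wald_z(x1, x2, y)
        reject += np.array([abs(z_model) > crit, abs(z_sandwich) > crit])
    return reject / n_rep


if __name__ == "__main__":
    model_rate, sandwich_rate = rejection_rates()
    print(f"model-based rejection rate: {model_rate:.3f}")
    print(f"sandwich rejection rate:    {sandwich_rate:.3f}")
```

In this assumed scenario the model-based test should tend to reject well above the nominal 5% level, while the sandwich-based test should stay close to it. This matches the abstract's claim for a quantitative outcome with two independent factors. The sketch does not cover the dependent-factor or binary-outcome cases, where the abstract states that the sandwich estimator does not fix the problem.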

Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Statistics in Medicine
Page Range / eLocation ID:
p. 1675-1694
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    “Covariate adjustment” in the randomized trial context refers to an estimator of the average treatment effect that adjusts for chance imbalances between study arms in baseline variables (called “covariates”). The baseline variables could include, for example, age, sex, disease severity, and biomarkers. According to two surveys of clinical trial reports, there is confusion about the statistical properties of covariate adjustment. We focus on the analysis of covariance (ANCOVA) estimator, which involves fitting a linear model for the outcome given the treatment arm and baseline variables, and trials that use simple randomization with equal probability of assignment to treatment and control. We prove the following new (to the best of our knowledge) robustness property of ANCOVA to arbitrary model misspecification: Not only is the ANCOVA point estimate consistent (as proved by Yang and Tsiatis, 2001) but so is its standard error. This implies that confidence intervals and hypothesis tests conducted as if the linear model were correct are still asymptotically valid even when the linear model is arbitrarily misspecified, for example, when the baseline variables are nonlinearly related to the outcome or there is treatment effect heterogeneity. We also give a simple, robust formula for the variance reduction (equivalently, sample size reduction) from using ANCOVA. By reanalyzing completed randomized trials for mild cognitive impairment, schizophrenia, and depression, we demonstrate how ANCOVA can achieve variance reductions of 4 to 32%.

  2. Summary

    Some covariate-adaptive randomization methods have been used in clinical trials for a long time, but little theoretical work had been done on testing hypotheses under covariate-adaptive randomization until Shao et al. (2010), who provided a theory with detailed discussion for responses under linear models. In this article, we establish some asymptotic results for covariate-adaptive biased coin randomization under generalized linear models with possibly unknown link functions. We show that the simple t-test without using any covariate is conservative under covariate-adaptive biased coin randomization in terms of its Type I error rate, and that a valid test using the bootstrap can be constructed. This bootstrap test, utilizing covariates in the randomization scheme, is shown to be asymptotically as efficient as Wald's test correctly using covariates in the analysis. Thus, the efficiency loss due to not using covariates in the analysis can be recovered by utilizing covariates in covariate-adaptive biased coin randomization. Our theory is illustrated with the two most popular types of discrete outcomes, binary responses and event counts under the Poisson model, and with exponentially distributed continuous responses. We also show that an alternative simple test without using any covariate under the Poisson model has an inflated Type I error rate under simple randomization, but is valid under covariate-adaptive biased coin randomization. The effects of model misspecification on the validity of tests are also discussed. Simulation studies of the Type I error rates and power of several tests are presented for both discrete and continuous responses.

  3. In spite of its urgent importance in the era of big data, testing high-dimensional parameters in generalized linear models (GLMs) in the presence of high-dimensional nuisance parameters has been largely under-studied, especially with regard to constructing powerful tests for general (and unknown) alternatives. Most existing tests are powerful only against certain alternatives and may yield incorrect Type I error rates under high-dimensional nuisance parameter situations. In this paper, we propose the adaptive interaction sum of powered score (aiSPU) test in the framework of penalized regression with a non-convex penalty, called truncated Lasso penalty (TLP), which can maintain correct Type I error rates while yielding high statistical power across a wide range of alternatives. To calculate its p-values analytically, we derive its asymptotic null distribution. Via simulations, its superior finite-sample performance is demonstrated over several representative existing methods. In addition, we apply it and other representative tests to an Alzheimer’s Disease Neuroimaging Initiative (ADNI) data set, detecting possible gene-gender interactions for Alzheimer’s disease. We also make the R package “aispu”, which implements the proposed test, available on GitHub.
  4. Summary

    Covariate-adaptive randomization is popular in clinical trials with sequentially arriving patients for balancing treatment assignments across prognostic factors that may influence the response. However, existing theory on tests for the treatment effect under covariate-adaptive randomization is limited to tests under linear or generalized linear models, although the covariate-adaptive randomization method has been used in survival analysis for a long time. Often, practitioners will simply adopt a conventional test to compare two treatments, which is controversial since tests derived under simple randomization may not be valid in terms of type I error under other randomization schemes. We derive the asymptotic distribution of the partial likelihood score function under covariate-adaptive randomization and a working model that is subject to possible model misspecification. Using this general result, we prove that the partial likelihood score test that is robust against model misspecification under simple randomization is no longer robust but conservative under covariate-adaptive randomization. We also show that the unstratified log-rank test is conservative and the stratified log-rank test remains valid under covariate-adaptive randomization. We propose a modification to variance estimation in the partial likelihood score test, which leads to a score test that is valid and robust against arbitrary model misspecification under a large family of covariate-adaptive randomization schemes including simple randomization. Furthermore, we show that the modified partial likelihood score test derived under a correctly specified model is more powerful than log-rank-type tests in terms of Pitman’s asymptotic relative efficiency. Simulation studies of the type I error and power of various tests are presented under several popular randomization schemes.

  5. Abstract

    We consider estimating average treatment effects (ATE) of a binary treatment in observational data when data‐driven variable selection is needed to select relevant covariates from a moderately large number of available covariates. To leverage those covariates that are predictive of the outcome for efficiency gain while using regularization to fit a parametric propensity score (PS) model, we consider a dimension reduction of the covariates based on fitting both working PS and outcome models using adaptive LASSO. A novel PS estimator, the Double‐index Propensity Score (DiPS), is proposed, in which the treatment status is smoothed over the linear predictors from both the initial working models. The ATE is estimated by using the DiPS in a normalized inverse probability weighting estimator, which is found to maintain double robustness and also local semiparametric efficiency with a fixed number of covariates p. Under misspecification of working models, the smoothing step leads to gains in efficiency and robustness over traditional doubly robust estimators. These results are extended to the case where p diverges with sample size and working models are sparse. Simulations show the benefits of the approach in finite samples. We illustrate the method by estimating the ATE of statins on colorectal cancer risk in an electronic medical record study and the effect of smoking on C‐reactive protein in the Framingham Offspring Study.
