

Title: Mixed effects envelope models

When multiple measures are collected repeatedly over time, redundancy typically exists among responses. The envelope method was recently proposed to reduce the dimension of responses without loss of information in regression with multivariate responses. It can gain substantial efficiency over the standard least squares estimator. In this paper, we generalize the envelope method to mixed effects models for longitudinal data with possibly unbalanced design and time‐varying predictors. We show that our model provides more efficient estimators than the standard estimators in mixed effects models. Improved accuracy and efficiency of the proposed method over the standard mixed effects model estimator are observed in both the simulations and the Action to Control Cardiovascular Risk in Diabetes (ACCORD) study.
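For orientation, the standard response envelope model for multivariate regression, which the abstract above takes as its starting point, can be sketched as follows. This is the usual background formulation and is not taken from the abstract; the symbols ($\Gamma$, $\eta$, $\Omega$, $u$, and so on) are assumptions introduced here for illustration, and the paper's mixed effects extension may be parameterized differently.

```latex
% Multivariate linear model with r responses and p predictors
Y = \alpha + \beta X + \varepsilon, \qquad
\varepsilon \sim N(0, \Sigma), \qquad
Y \in \mathbb{R}^{r},\; X \in \mathbb{R}^{p},\; \beta \in \mathbb{R}^{r \times p}.

% Envelope structure: only a u-dimensional part of Y carries information about beta
\beta = \Gamma \eta, \qquad
\Sigma = \Gamma \Omega \Gamma^{\top} + \Gamma_{0} \Omega_{0} \Gamma_{0}^{\top},

% Gamma (r x u) is a semi-orthogonal basis of the envelope,
% Gamma_0 (r x (r-u)) is a basis of its orthogonal complement,
% eta is u x p, and Omega, Omega_0 are positive definite.
\Gamma \in \mathbb{R}^{r \times u}, \quad
\Gamma_{0} \in \mathbb{R}^{r \times (r-u)}, \quad
\eta \in \mathbb{R}^{u \times p}, \quad 0 \le u \le r.
```

When $u = r$ the envelope imposes no reduction and the model coincides with the standard multivariate linear model; efficiency gains arise when $u < r$, because the variation in $\Gamma_{0}^{\top} Y$ is then immaterial to estimating $\beta$.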

 
Award ID(s): 1916013
NSF-PAR ID: 10453340
Author(s) / Creator(s):
Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name: Stat
Volume: 9
Issue: 1
ISSN: 2049-1573
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    Statistical analysis of longitudinal data often involves modeling treatment effects on clinically relevant longitudinal biomarkers since an initial event (the time origin). In some studies including preventive HIV vaccine efficacy trials, some participants have biomarkers measured starting at the time origin, whereas others have biomarkers measured starting later with the time origin unknown. The semiparametric additive time-varying coefficient model is investigated where the effects of some covariates vary nonparametrically with time while the effects of others remain constant. Weighted profile least squares estimators coupled with kernel smoothing are developed. The method uses the expectation maximization approach to deal with the censored time origin. The Kaplan–Meier estimator and other failure time regression models such as the Cox model can be utilized to estimate the distribution and the conditional distribution of left censored event time related to the censored time origin. Asymptotic properties of the parametric and nonparametric estimators and consistent asymptotic variance estimators are derived. A two-stage estimation procedure for choosing weight is proposed to improve estimation efficiency. Numerical simulations are conducted to examine finite sample properties of the proposed estimators. The simulation results show that the theory and methods work well. The efficiency gain of the two-stage estimation procedure depends on the distribution of the longitudinal error processes. The method is applied to analyze data from the Merck 023/HVTN 502 Step HIV vaccine study.

     
  2. Generalized linear mixed models are commonly used to describe relationships between correlated responses and covariates in medical research. In this paper, we propose a simple and easily implementable regularized estimation approach to select both fixed and random effects in generalized linear mixed models. Specifically, we propose to construct and optimize the objective functions using the confidence distributions of model parameters, as opposed to the observed data likelihood functions, to perform effect selection. Two estimation methods are developed. The first uses the joint confidence distribution of model parameters to select fixed and random effects simultaneously. The second uses the marginal confidence distributions of model parameters to select fixed and random effects separately. With a proper choice of regularization parameters in the adaptive LASSO framework, we show the consistency and oracle properties of the proposed regularized estimators. Simulation studies have been conducted to assess the performance of the proposed estimators and demonstrate computational efficiency. Our method has also been applied to two longitudinal cancer studies to identify demographic and clinical factors associated with patient health outcomes after cancer therapies.

     
  3. Abstract

    We consider estimating average treatment effects (ATE) of a binary treatment in observational data when data‐driven variable selection is needed to select relevant covariates from a moderately large number of available covariates. To leverage those covariates that are predictive of the outcome for efficiency gain while using regularization to fit a parametric propensity score (PS) model, we consider a dimension reduction of the covariates based on fitting both working PS and outcome models using adaptive LASSO. A novel PS estimator, the Double‐index Propensity Score (DiPS), is proposed, in which the treatment status is smoothed over the linear predictors of the covariates from both of the initial working models. The ATE is estimated by using the DiPS in a normalized inverse probability weighting estimator, which is found to maintain double robustness and also local semiparametric efficiency with a fixed number of covariates p. Under misspecification of working models, the smoothing step leads to gains in efficiency and robustness over traditional doubly robust estimators. These results are extended to the case where p diverges with sample size and working models are sparse. Simulations show the benefits of the approach in finite samples. We illustrate the method by estimating the ATE of statins on colorectal cancer risk in an electronic medical record study and the effect of smoking on C‐reactive protein in the Framingham Offspring Study.

     
  4. Abstract

    Many variables of interest in agricultural or economic surveys have skewed distributions and can equal zero. Our data are measures of sheet and rill erosion called Revised Universal Soil Loss Equation 2 (RUSLE2). Small area estimates of mean RUSLE2 erosion are of interest. We use a zero‐inflated lognormal mixed effects model for small area estimation. The model combines a unit‐level lognormal model for the positive RUSLE2 responses with a unit‐level logistic mixed effects model for the binary indicator that the response is nonzero. In the Conservation Effects Assessment Project (CEAP) data, counties with a higher probability of nonzero responses also tend to have a higher mean among the positive RUSLE2 values. We capture this property of the data through an assumption that the pair of random effects for a county are correlated. We develop empirical Bayes (EB) small area predictors and a bootstrap estimator of the mean squared error (MSE). In simulations, the proposed predictor is superior to simpler alternatives. We then apply the method to construct EB predictors of mean RUSLE2 erosion for South Dakota counties. To obtain auxiliary variables for the population of cropland in South Dakota, we integrate a satellite‐derived land cover map with a geographic database of soil properties. We provide an R Shiny application called viscover (available at https://lyux.shinyapps.io/viscover/) to visualize the overlay operations required to construct the covariates. On the basis of bootstrap estimates of the MSE, we conclude that the EB predictors of mean RUSLE2 erosion are superior to direct estimators.

     
  5. Abstract

    A constrained multivariate linear model is a multivariate linear model with the columns of its coefficient matrix constrained to lie in a known subspace. This class of models includes those typically used to study growth curves and longitudinal data. Envelope methods have been proposed to improve the estimation efficiency in unconstrained multivariate linear models, but have not yet been developed for constrained models. We pursue that development in this article. We first compare the standard envelope estimator with the standard estimator arising from a constrained multivariate model in terms of bias and efficiency. To further improve efficiency, we propose a novel envelope estimator based on a constrained multivariate model. We show the advantage of our proposals by simulations and by studying the probiotic capacity to reduce Salmonella infection.

     