Title: Time‐varying feature selection for longitudinal analysis

We propose a spline‐based model selection and estimation method for time‐varying coefficient models that is capable of capturing time‐dependent covariate effects. The new penalty function utilizes local‐region information for varying‐coefficient estimation, in contrast to traditional model selection approaches, which focus on the entire region. The proposed method is particularly useful when the signals associated with relevant predictors are time‐dependent and detecting covariate effects in a local region is more scientifically relevant than detecting them over the entire region. Our simulation studies indicate that the proposed model selection incorporating local features outperforms global feature model selection approaches. The proposed method is also illustrated through a longitudinal growth and health study from the National Heart, Lung, and Blood Institute.
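
For orientation, the setup described in the abstract can be sketched in generic notation. This is only an illustrative formulation under stated assumptions, not the paper's exact model or penalty. The symbols $y_{ij}$, $x_{ijk}$, $t_{ij}$, $B_l$, $\gamma_{kl}$, the region index $r$, and the penalty $p_\lambda$ are all introduced here and do not appear in the record.

```latex
% Varying-coefficient model for subject i at visit j (standard form):
y_{ij} = \sum_{k=1}^{p} x_{ijk}\,\beta_k(t_{ij}) + \varepsilon_{ij}

% Spline approximation of each coefficient function
% (B_l: a spline basis with local support, e.g. B-splines):
\beta_k(t) \approx \sum_{l=1}^{L} \gamma_{kl}\,B_l(t)

% A generic penalized least-squares objective with a local-region
% group penalty (assumed form, for illustration only):
\min_{\gamma}\;
\sum_{i=1}^{n}\sum_{j=1}^{m_i}
\Bigl( y_{ij} - \sum_{k=1}^{p}\sum_{l=1}^{L} x_{ijk}\,\gamma_{kl}\,B_l(t_{ij}) \Bigr)^{2}
+ \sum_{k=1}^{p}\sum_{r=1}^{R} p_\lambda\bigl( \lVert \gamma_{k,(r)} \rVert_2 \bigr)
```

Here $\gamma_{k,(r)}$ denotes the block of spline coefficients whose basis functions are supported on local region $r$. Because such basis functions have local support, shrinking a block to zero sets $\beta_k(t)$ to zero on that region only, whereas a global penalty on the whole vector $\gamma_k$ would keep or drop covariate $k$ over the entire time range.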

 
Award ID(s):
1821198 2019461 1812258
NSF-PAR ID:
10460762
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Statistics in Medicine
Volume:
39
Issue:
2
ISSN:
0277-6715
Page Range / eLocation ID:
p. 156-170
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary

    This article investigates a generalized semiparametric varying-coefficient model for longitudinal data that can flexibly model three types of covariate effects: time-constant effects, time-varying effects, and covariate-varying effects. Different link functions can be selected to provide a rich family of models for longitudinal data. The model assumes that the time-varying effects are unspecified functions of time and the covariate-varying effects are parametric functions of an exposure variable specified up to a finite number of unknown parameters. The estimation procedure is developed using local linear smoothing and profile weighted least squares estimation techniques. Hypothesis testing procedures are developed to test the parametric functions of the covariate-varying effects. The asymptotic distributions of the proposed estimators are established. A working formula for bandwidth selection is discussed and examined through simulations. Our simulation study shows that the proposed methods have satisfactory finite sample performance. The proposed methods are applied to the ACTG 244 clinical trial of HIV-infected patients being treated with Zidovudine to examine the effects of antiretroviral treatment switching before and after HIV develops the T215Y/F drug resistance mutation. Our analysis shows benefits of treatment switching to the combination therapies as compared to continuing with ZDV monotherapy before and after developing the 215-mutation.

     
  2. Abstract

    The generalized semiparametric mixed varying‐coefficient effects model for longitudinal data can accommodate a variety of link functions and flexibly model different types of covariate effects, including time‐constant, time‐varying and covariate‐varying effects. The time‐varying effects are unspecified functions of time and the covariate‐varying effects are nonparametric functions of a possibly time‐dependent exposure variable. A semiparametric estimation procedure is developed that uses local linear smoothing and profile weighted least squares, which requires smoothing in the two different and yet connected domains of time and the time‐dependent exposure variable. The asymptotic properties of the estimators of both nonparametric and parametric effects are investigated. In addition, hypothesis testing procedures are developed to examine the covariate effects. The finite‐sample properties of the proposed estimators and testing procedures are examined through simulations, indicating satisfactory performances. The proposed methods are applied to analyze the AIDS Clinical Trial Group 244 clinical trial to investigate the effects of antiretroviral treatment switching in HIV‐infected patients before and after developing the T215Y antiretroviral drug resistance mutation. The Canadian Journal of Statistics 47: 352–373; 2019 © 2019 Statistical Society of Canada

     
  3. This paper investigates semiparametric statistical methods for recurrent events. The mean number of recurrent events is modeled with the generalized semiparametric varying‐coefficient model, which can flexibly model three types of covariate effects: time‐constant effects, time‐varying effects, and covariate‐varying effects. We assume that the time‐varying effects are unspecified functions of time and the covariate‐varying effects are parametric functions of an exposure variable specified up to a finite number of unknown parameters. Different link functions can be selected to provide a rich family of models for recurrent events data. Profile estimation methods are developed for the parametric and nonparametric components, and their asymptotic properties are established. We also develop hypothesis testing procedures to test the validity of the parametric forms of the covariate‐varying effects. The simulation study shows that both the estimation and the hypothesis testing procedures perform well. The proposed method is applied to analyze a data set from an acyclovir study and to investigate whether acyclovir treatment reduces the mean number of relapse recurrences.

     
  4. Spatial–temporal data arise frequently in biomedical, environmental, political and social science studies. Capturing dynamic changes of time-varying correlation structure is scientifically important in spatio-temporal data analysis. We approximate the time-varying empirical estimator of the spatial correlation matrix by groups of selected basis matrices representing substructures of the correlation matrix. After projecting the correlation structure matrix onto a space spanned by basis matrices, we also incorporate varying-coefficient model selection and estimation for signals associated with relevant basis matrices. The unique feature of the proposed method is that signals at local regions corresponding with time can be identified through the proposed penalized objective function. Theoretically, we show model selection consistency and the oracle property in detecting local signals for the varying-coefficient estimators. The proposed method is illustrated through simulation studies and brain fMRI data. 
  5. Abstract

    Statistical analysis of longitudinal data often involves modeling treatment effects on clinically relevant longitudinal biomarkers since an initial event (the time origin). In some studies including preventive HIV vaccine efficacy trials, some participants have biomarkers measured starting at the time origin, whereas others have biomarkers measured starting later with the time origin unknown. The semiparametric additive time-varying coefficient model is investigated where the effects of some covariates vary nonparametrically with time while the effects of others remain constant. Weighted profile least squares estimators coupled with kernel smoothing are developed. The method uses the expectation maximization approach to deal with the censored time origin. The Kaplan–Meier estimator and other failure time regression models such as the Cox model can be utilized to estimate the distribution and the conditional distribution of the left-censored event time related to the censored time origin. Asymptotic properties of the parametric and nonparametric estimators and consistent asymptotic variance estimators are derived. A two-stage estimation procedure for choosing the weights is proposed to improve estimation efficiency. Numerical simulations are conducted to examine finite sample properties of the proposed estimators. The simulation results show that the theory and methods work well. The efficiency gain of the two-stage estimation procedure depends on the distribution of the longitudinal error processes. The method is applied to analyze data from the Merck 023/HVTN 502 Step HIV vaccine study.

     