Title: Profile estimation of generalized semiparametric varying-coefficient additive models for longitudinal data with within-subject correlations
In this paper, we study several profile estimation methods for the generalized semiparametric varying-coefficient additive model for longitudinal data by utilizing the within-subject correlations. The model is flexible in allowing time-varying effects for some covariates and constant effects for others, and in offering a choice of link functions, so it can be used to analyze both discrete and continuous longitudinal responses. We investigate the profile generalized estimating equation (GEE) approaches and the profile quadratic inference function (QIF) approach. The profile estimation uses local linear smoothing to estimate the time-varying effects. Several approaches that incorporate the within-subject correlations are investigated, including the quasi-likelihood (QL), the minimum generalized variance (MGV), the quadratic inference function (QIF) and the weighted least squares (WLS). The proposed estimation procedures can accommodate flexible sampling schemes. These methods provide a unified approach that works well for discrete as well as continuous longitudinal responses. Finite sample performances of these methods are examined through Monte Carlo simulations under various correlation structures for both discrete and continuous longitudinal responses. The simulation results show efficiency improvement over the working independence approach by utilizing the within-subject correlations, as well as the comparative performances of the different approaches.
Award ID(s):
1915829
PAR ID:
10469662
Author(s) / Creator(s):
Publisher / Repository:
Springer
Date Published:
ISSN:
2199-0980
ISBN:
978-3-031-08328-0
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Longitudinal studies with binary or ordinal responses are widely encountered in various disciplines, where the primary focus is on the temporal evolution of the probability of each response category. Traditional approaches build from the generalized mixed effects modeling framework. Even amplified with nonparametric priors placed on the fixed or random effects, such models are restrictive due to the implied assumptions on the marginal expectation and covariance structure of the responses. We tackle the problem from a functional data analysis perspective, treating the observations for each subject as realizations from subject-specific stochastic processes at the measured times. We develop the methodology focusing initially on binary responses, for which we assume the stochastic processes have Binomial marginal distributions. Leveraging the logits representation, we model the discrete space processes through continuous space processes. We utilize a hierarchical framework to model the mean and covariance kernel of the continuous space processes nonparametrically and simultaneously through a Gaussian process prior and an Inverse-Wishart process prior, respectively. The prior structure results in flexible inference for the evolution and correlation of binary responses, while allowing for borrowing of strength across all subjects. The modeling approach can be naturally extended to ordinal responses. Here, the continuation-ratio logits factorization of the multinomial distribution is key for efficient modeling and inference, including a practical way of dealing with unbalanced longitudinal data. The methodology is illustrated with synthetic data examples and an analysis of college students’ mental health status data. 
  2. Abstract The generalized semiparametric mixed varying‐coefficient effects model for longitudinal data can accommodate a variety of link functions and flexibly model different types of covariate effects, including time‐constant, time‐varying and covariate‐varying effects. The time‐varying effects are unspecified functions of time and the covariate‐varying effects are nonparametric functions of a possibly time‐dependent exposure variable. A semiparametric estimation procedure is developed that uses local linear smoothing and profile weighted least squares, which requires smoothing in the two different and yet connected domains of time and the time‐dependent exposure variable. The asymptotic properties of the estimators of both nonparametric and parametric effects are investigated. In addition, hypothesis testing procedures are developed to examine the covariate effects. The finite‐sample properties of the proposed estimators and testing procedures are examined through simulations, indicating satisfactory performances. The proposed methods are applied to analyze the AIDS Clinical Trial Group 244 clinical trial to investigate the effects of antiretroviral treatment switching in HIV‐infected patients before and after developing the T215Y antiretroviral drug resistance mutation. The Canadian Journal of Statistics 47: 352–373; 2019 © 2019 Statistical Society of Canada
  3. Abstract The availability of vast amounts of longitudinal data from electronic health records (EHRs) and personal wearable devices opens the door to numerous new research questions. In many studies, individual variability of a longitudinal outcome is as important as the mean. Blood pressure fluctuations, glycemic variations, and mood swings are prime examples where it is critical to identify factors that affect the within‐individual variability. We propose a scalable method, within‐subject variance estimator by robust regression (WiSER), for the estimation and inference of the effects of both time‐varying and time‐invariant predictors on within‐subject variance. It is robust against the misspecification of the conditional distribution of responses or the distribution of random effects. It shows performance similar to that of the correctly specified likelihood methods but is 10^3 to 10^5 times faster. The estimation algorithm scales linearly in the total number of observations, making it applicable to massive longitudinal data sets. The effectiveness of WiSER is evaluated in extensive simulation studies. Its broad applicability is illustrated using the accelerometry data from the Women's Health Study and a clinical trial for longitudinal diabetes care.
  4. Roozbeh, Mahdi (Ed.)
    The genetic basis of complex traits involves the function of many genes with small effects as well as complex gene-gene and gene-environment interactions. As one of the major players in complex diseases, the role of gene-environment interactions has been increasingly recognized. Motivated by epidemiology studies to evaluate the joint effect of environmental mixtures, we developed a functional varying-index coefficient model (FVICM) to assess the combined effect of environmental mixtures and their interactions with genes, under a longitudinal design with quantitative traits. Built upon the previous work, we extend the FVICM model to accommodate binary longitudinal traits through the development of a generalized functional varying-index coefficient model (gFVICM). This model examines how the genetic effects on a disease trait are nonlinearly influenced by a combination of environmental factors. We derive an estimation procedure for the varying-index coefficient functions using quadratic inference functions combined with penalized splines. A hypothesis testing procedure is proposed to evaluate the significance of the nonparametric index functions. Extensive Monte Carlo simulations are conducted to evaluate the performance of the method under finite samples. The utility of the method is further demonstrated through a case study with a pain sensitivity dataset. SNPs were found to have their effects on blood pressure nonlinearly influenced by a combination of environmental factors. 
  5. Abstract Statistical analysis of longitudinal data often involves modeling treatment effects on clinically relevant longitudinal biomarkers since an initial event (the time origin). In some studies including preventive HIV vaccine efficacy trials, some participants have biomarkers measured starting at the time origin, whereas others have biomarkers measured starting later with the time origin unknown. The semiparametric additive time-varying coefficient model is investigated where the effects of some covariates vary nonparametrically with time while the effects of others remain constant. Weighted profile least squares estimators coupled with kernel smoothing are developed. The method uses the expectation maximization approach to deal with the censored time origin. The Kaplan–Meier estimator and other failure time regression models such as the Cox model can be utilized to estimate the distribution and the conditional distribution of left censored event time related to the censored time origin. Asymptotic properties of the parametric and nonparametric estimators and consistent asymptotic variance estimators are derived. A two-stage estimation procedure for choosing weight is proposed to improve estimation efficiency. Numerical simulations are conducted to examine finite sample properties of the proposed estimators. The simulation results show that the theory and methods work well. The efficiency gain of the two-stage estimation procedure depends on the distribution of the longitudinal error processes. The method is applied to analyze data from the Merck 023/HVTN 502 Step HIV vaccine study. 