In survival regression analysis, when the time-dependent covariates are censored and measured with errors, a joint model is often considered for the longitudinal covariate data and the survival data. Typically, an empirical linear (mixed) model is assumed for the time-dependent covariates. However, such an empirical linear covariate model may be inappropriate for the (unobserved) censored covariate values that may behave quite differently from the observed covariate process. In applications such as human immunodeficiency virus–acquired immune deficiency syndrome studies, a mechanistic non-linear model can be derived for the covariate process on the basis of the underlying data generation mechanisms and such a non-linear covariate model may provide better ‘predictions’ for the censored and mismeasured covariate values. We propose a joint Cox and non-linear mixed effect model to model survival data with censored and mismeasured time varying covariates. We use likelihood methods for inference, implemented by the Monte Carlo EM algorithm. The models and methods are evaluated by simulations. An acquired immune deficiency syndrome data set is analysed in detail, where the time-dependent covariate is a viral load which may be censored because of a lower detection limit and may also be measured with errors. The results based on linear and non-linear covariate models are compared and new insights are gained.
We consider the situation where the random effects in a generalized linear mixed model may be correlated with one of the predictors, which leads to inconsistent estimators. We show that conditional maximum likelihood can eliminate this bias. Conditional likelihood leads naturally to the partitioning of the covariate into between- and within-cluster components and models that include separate terms for these components also eliminate the source of the bias. Another viewpoint that we develop is the idea that many violations of the assumptions (including correlation between the random effects and a covariate) in a generalized linear mixed model may be cast as misspecified mixing distributions. We illustrate the results with two examples and simulations.
more » « less- PAR ID:
- 10405574
- Publisher / Repository:
- Oxford University Press
- Date Published:
- Journal Name:
- Journal of the Royal Statistical Society Series B: Statistical Methodology
- Volume:
- 68
- Issue:
- 5
- ISSN:
- 1369-7412
- Format(s):
- Medium: X Size: p. 859-872
- Size(s):
- p. 859-872
- Sponsoring Org:
- National Science Foundation
More Like this
-
Summary -
Summary We introduce a flexible marginal modelling approach for statistical inference for clustered and longitudinal data under minimal assumptions. This estimated estimating equations approach is semiparametric and the proposed models are fitted by quasi-likelihood regression, where the unknown marginal means are a function of the fixed effects linear predictor with unknown smooth link, and variance–covariance is an unknown smooth function of the marginal means. We propose to estimate the nonparametric link and variance–covariance functions via smoothing methods, whereas the regression parameters are obtained via the estimated estimating equations. These are score equations that contain nonparametric function estimates. The proposed estimated estimating equations approach is motivated by its flexibility and easy implementation. Moreover, if data follow a generalized linear mixed model, with either a specified or an unspecified distribution of random effects and link function, the model proposed emerges as the corresponding marginal (population-average) version and can be used to obtain inference for the fixed effects in the underlying generalized linear mixed model, without the need to specify any other components of this generalized linear mixed model. Among marginal models, the estimated estimating equations approach provides a flexible alternative to modelling with generalized estimating equations. Applications of estimated estimating equations include diagnostics and link selection. The asymptotic distribution of the proposed estimators for the model parameters is derived, enabling statistical inference. Practical illustrations include Poisson modelling of repeated epileptic seizure counts and simulations for clustered binomial responses.
-
Summary In longitudinal data analysis one frequently encounters non-Gaussian data that are repeatedly collected for a sample of individuals over time. The repeated observations could be binomial, Poisson or of another discrete type or could be continuous. The timings of the repeated measurements are often sparse and irregular. We introduce a latent Gaussian process model for such data, establishing a connection to functional data analysis. The functional methods proposed are non-parametric and computationally straightforward as they do not involve a likelihood. We develop functional principal components analysis for this situation and demonstrate the prediction of individual trajectories from sparse observations. This method can handle missing data and leads to predictions of the functional principal component scores which serve as random effects in this model. These scores can then be used for further statistical analysis, such as inference, regression, discriminant analysis or clustering. We illustrate these non-parametric methods with longitudinal data on primary biliary cirrhosis and show in simulations that they are competitive in comparisons with generalized estimating equations and generalized linear mixed models.
-
Summary Covariate-adaptive randomization is popular in clinical trials with sequentially arrived patients for balancing treatment assignments across prognostic factors that may have influence on the response. However, existing theory on tests for the treatment effect under covariate-adaptive randomization is limited to tests under linear or generalized linear models, although the covariate-adaptive randomization method has been used in survival analysis for a long time. Often, practitioners will simply adopt a conventional test to compare two treatments, which is controversial since tests derived under simple randomization may not be valid in terms of type I error under other randomization schemes. We derive the asymptotic distribution of the partial likelihood score function under covariate-adaptive randomization and a working model that is subject to possible model misspecification. Using this general result, we prove that the partial likelihood score test that is robust against model misspecification under simple randomization is no longer robust but conservative under covariate-adaptive randomization. We also show that the unstratified log-rank test is conservative and the stratified log-rank test remains valid under covariate-adaptive randomization. We propose a modification to variance estimation in the partial likelihood score test, which leads to a score test that is valid and robust against arbitrary model misspecification under a large family of covariate-adaptive randomization schemes including simple randomization. Furthermore, we show that the modified partial likelihood score test derived under a correctly specified model is more powerful than log-rank-type tests in terms of Pitman’s asymptotic relative efficiency. Simulation studies about the type I error and power of various tests are presented under several popular randomization schemes.
-
Abstract We propose a Bayesian model selection approach for generalized linear mixed models (GLMMs). We consider covariance structures for the random effects that are widely used in areas such as longitudinal studies, genome-wide association studies, and spatial statistics. Since the random effects cannot be integrated out of GLMMs analytically, we approximate the integrated likelihood function using a pseudo-likelihood approach. Our Bayesian approach assumes a flat prior for the fixed effects and includes both approximate reference prior and half-Cauchy prior choices for the variances of random effects. Since the flat prior on the fixed effects is improper, we develop a fractional Bayes factor approach to obtain posterior probabilities of the several competing models. Simulation studies with Poisson GLMMs with spatial random effects and overdispersion random effects show that our approach performs favorably when compared to widely used competing Bayesian methods including deviance information criterion and Watanabe–Akaike information criterion. We illustrate the usefulness and flexibility of our approach with three case studies including a Poisson longitudinal model, a Poisson spatial model, and a logistic mixed model. Our proposed approach is implemented in the R package GLMMselect that is available on CRAN.