skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, October 10 until 2:00 AM ET on Friday, October 11 due to maintenance. We apologize for the inconvenience.


Title: Efficiency of naive estimators for accelerated failure time models under length‐biased sampling
Abstract

In prevalent cohort studies where subjects are recruited at a cross‐section, the time to an event may be subject to length‐biased sampling, with the observed data being either the forward recurrence time, or the backward recurrence time, or their sum. In the regression setting, assuming a semiparametric accelerated failure time model for the underlying event time, where the intercept parameter is absorbed into the nuisance parameter, it has been shown that the model remains invariant under these observed data setups and can be fitted using standard methodology for accelerated failure time model estimation, ignoring the length bias. However, the efficiency of these estimators is unclear, owing to the fact that the observed covariate distribution, which is also length biased, may contain information about the regression parameter in the accelerated life model. We demonstrate that if the true covariate distribution is completely unspecified, then the naive estimator based on the conditional likelihood given the covariates is fully efficient for the slope.

 
more » « less
NSF-PAR ID:
10367239
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Scandinavian Journal of Statistics
Volume:
49
Issue:
2
ISSN:
0303-6898
Format(s):
Medium: X Size: p. 525-541
Size(s):
p. 525-541
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Censored quantile regression models, which offer great flexibility in assessing covariate effects on event times, have attracted considerable research interest. In this study, we consider flexible estimation and inference procedures for competing risks quantile regression, which not only provides meaningful interpretations by using cumulative incidence quantiles but also extends the conventional accelerated failure time model by relaxing some of the stringent model assumptions, such as global linearity and unconditional independence. Current method for censored quantile regressions often involves the minimization of theL1‐type convex function or solving the nonsmoothed estimating equations. This approach could lead to multiple roots in practical settings, particularly with multiple covariates. Moreover, variance estimation involves an unknown error distribution and most methods rely on computationally intensive resampling techniques such as bootstrapping. We consider the induced smoothing procedure for censored quantile regressions to the competing risks setting. The proposed procedure permits the fast and accurate computation of quantile regression parameter estimates and standard variances by using conventional numerical methods such as the Newton–Raphson algorithm. Numerical studies show that the proposed estimators perform well and the resulting inference is reliable in practical settings. The method is finally applied to data from a soft tissue sarcoma study.

     
    more » « less
  2. Summary

    In survival regression analysis, when the time-dependent covariates are censored and measured with errors, a joint model is often considered for the longitudinal covariate data and the survival data. Typically, an empirical linear (mixed) model is assumed for the time-dependent covariates. However, such an empirical linear covariate model may be inappropriate for the (unobserved) censored covariate values that may behave quite differently from the observed covariate process. In applications such as human immunodeficiency virus–acquired immune deficiency syndrome studies, a mechanistic non-linear model can be derived for the covariate process on the basis of the underlying data generation mechanisms and such a non-linear covariate model may provide better ‘predictions’ for the censored and mismeasured covariate values. We propose a joint Cox and non-linear mixed effect model to model survival data with censored and mismeasured time varying covariates. We use likelihood methods for inference, implemented by the Monte Carlo EM algorithm. The models and methods are evaluated by simulations. An acquired immune deficiency syndrome data set is analysed in detail, where the time-dependent covariate is a viral load which may be censored because of a lower detection limit and may also be measured with errors. The results based on linear and non-linear covariate models are compared and new insights are gained.

     
    more » « less
  3. Summary

    In many observational longitudinal studies, the outcome of interest presents a skewed distribution, is subject to censoring due to detection limit or other reasons, and is observed at irregular times that may follow a outcome-dependent pattern. In this work, we consider quantile regression modeling of such longitudinal data, because quantile regression is generally robust in handling skewed and censored outcomes and is flexible to accommodate dynamic covariate-outcome relationships. Specifically, we study a longitudinal quantile regression model that specifies covariate effects on the marginal quantiles of the longitudinal outcome. Such a model is easy to interpret and can accommodate dynamic outcome profile changes over time. We propose estimation and inference procedures that can appropriately account for censoring and irregular outcome-dependent follow-up. Our proposals can be readily implemented based on existing software for quantile regression. We establish the asymptotic properties of the proposed estimator, including uniform consistency and weak convergence. Extensive simulations suggest good finite-sample performance of the new method. We also present an analysis of data from a long-term study of a population exposed to polybrominated biphenyls (PBB), which uncovers an inhomogeneous PBB elimination pattern that would not be detected by traditional longitudinal data analysis.

     
    more » « less
  4. Summary

    In a regression model, the joint distribution for each finite sample of units is determined by a function px(y) depending only on the list of covariate values x=(x(u1),…,x(un)) on the sampled units. No random sampling of units is involved. In biological work, random sampling is frequently unavoidable, in which case the joint distribution p(y,x) depends on the sampling scheme. Regression models can be used for the study of dependence provided that the conditional distribution p(y|x) for random samples agrees with px(y) as determined by the regression model for a fixed sample having a non-random configuration x. The paper develops a model that avoids the concept of a fixed population of units, thereby forcing the sampling plan to be incorporated into the sampling distribution. For a quota sample having a predetermined covariate configuration x, the sampling distribution agrees with the standard logistic regression model with correlated components. For most natural sampling plans such as sequential or simple random sampling, the conditional distribution p(y|x) is not the same as the regression distribution unless px(y) has independent components. In this sense, most natural sampling schemes involving binary random-effects models are biased. The implications of this formulation for subject-specific and population-averaged procedures are explored.

     
    more » « less
  5. Abstract

    With advances in biomedical research, biomarkers are becoming increasingly important prognostic factors for predicting overall survival, while the measurement of biomarkers is often censored due to instruments' lower limits of detection. This leads to two types of censoring: random censoring in overall survival outcomes and fixed censoring in biomarker covariates, posing new challenges in statistical modeling and inference. Existing methods for analyzing such data focus primarily on linear regression ignoring censored responses or semiparametric accelerated failure time models with covariates under detection limits (DL). In this paper, we propose a quantile regression for survival data with covariates subject to DL. Comparing to existing methods, the proposed approach provides a more versatile tool for modeling the distribution of survival outcomes by allowing covariate effects to vary across conditional quantiles of the survival time and requiring no parametric distribution assumptions for outcome data. To estimate the quantile process of regression coefficients, we develop a novel multiple imputation approach based on another quantile regression for covariates under DL, avoiding stringent parametric restrictions on censored covariates as often assumed in the literature. Under regularity conditions, we show that the estimation procedure yields uniformly consistent and asymptotically normal estimators. Simulation results demonstrate the satisfactory finite‐sample performance of the method. We also apply our method to the motivating data from a study of genetic and inflammatory markers of Sepsis.

     
    more » « less