We propose an efficient estimator for the coefficients in censored quantile regression using the envelope model. The envelope model uses dimension reduction techniques to identify material and immaterial components in the data, and forms the estimator based only on the material component, thus reducing the variability of estimation. We will demonstrate the guaranteed asymptotic efficiency gain of our proposed envelope estimator over the traditional estimator for censored quantile regression. Our analysis begins with the local weighting approach that traditionally relies on semiparametric ‐estimation involving the conditional Kaplan–Meier estimator. We will instead invoke the independent identically distributed (i.i.d.) representation of the Kaplan–Meier estimator, which eliminates this infinite‐dimensional nuisance and transforms our objective function in ‐estimation into a ‐process indexed by only a Euclidean parameter. The modified ‐estimation problem becomes entirely parametric and hence more amenable to analysis. We will also reconsider the i.i.d. representation of the conditional Kaplan–Meier estimator.
- 10186571
- Journal Name: Biometrika
- ISSN: 0006-3444
- Sponsoring Org: National Science Foundation
More Like this
High‐dimensional data with censored outcomes of interest are prevalent in medical research. To analyze such data, the regularized Buckley–James estimator has been successfully applied to build accurate predictive models and conduct variable selection. In this paper, we consider the problem of parameter estimation and variable selection for the semiparametric accelerated failure time model for high‐dimensional block‐missing multimodal neuroimaging data with censored outcomes. We propose a penalized Buckley–James method that can simultaneously handle block‐wise missing covariates and censored outcomes. This method can also perform variable selection. The proposed method is evaluated by simulations and applied to a multimodal neuroimaging dataset and obtains meaningful results.
Popular parametric and semiparametric hazards regression models for clustered survival data are inappropriate and inadequate when the unknown effects of different covariates and clustering are complex. This calls for a flexible modeling framework to yield efficient survival prediction. Moreover, for some survival studies involving time to occurrence of some asymptomatic events, survival times are typically interval censored between consecutive clinical inspections. In this article, we propose a robust semiparametric model for clustered interval‐censored survival data under a paradigm of Bayesian ensemble learning, called soft Bayesian additive regression trees or SBART (Linero and Yang, 2018), which combines multiple sparse (soft) decision trees to attain excellent predictive accuracy. We develop a novel semiparametric hazards regression model by modeling the hazard function as a product of a parametric baseline hazard function and a nonparametric component that uses SBART to incorporate clustering, unknown functional forms of the main effects, and interaction effects of various covariates. In addition to being applicable for left‐censored, right‐censored, and interval‐censored survival data, our methodology is implemented using a data augmentation scheme which allows for existing Bayesian backfitting algorithms to be used. We illustrate the practical implementation and advantages of our method via simulation studies and an analysis of a prostate cancer surgery study where dependence on the experience and skill level of the physicians leads to clustering of survival times. We conclude by discussing our method's applicability in studies involving high‐dimensional data with complex underlying associations.
Failure time data subject to various types of censoring commonly arise in epidemiological and biomedical studies. Motivated by an AIDS clinical trial, we consider regression analysis of failure time data that include exact and left‐, interval‐, and/or right‐censored observations, which are often referred to as partly interval‐censored failure time data. We study the effects of potentially time‐dependent covariates on partly interval‐censored failure time via a class of semiparametric transformation models that includes the widely used proportional hazards model and the proportional odds model as special cases. We propose an EM algorithm for the nonparametric maximum likelihood estimation and show that it unifies some existing approaches developed for traditional right‐censored data or purely interval‐censored data. In particular, the proposed method reduces to the partial likelihood approach in the case of right‐censored data under the proportional hazards model. We establish that the resulting estimator is consistent and asymptotically normal. In addition, we investigate the proposed method via simulation studies and apply it to the motivating AIDS clinical trial.
In prevalent cohort studies where subjects are recruited at a cross‐section, the time to an event may be subject to length‐biased sampling, with the observed data being either the forward recurrence time, or the backward recurrence time, or their sum. In the regression setting, assuming a semiparametric accelerated failure time model for the underlying event time, where the intercept parameter is absorbed into the nuisance parameter, it has been shown that the model remains invariant under these observed data setups and can be fitted using standard methodology for accelerated failure time model estimation, ignoring the length bias. However, the efficiency of these estimators is unclear, owing to the fact that the observed covariate distribution, which is also length biased, may contain information about the regression parameter in the accelerated life model. We demonstrate that if the true covariate distribution is completely unspecified, then the naive estimator based on the conditional likelihood given the covariates is fully efficient for the slope.