In the absence of data from a randomized trial, researchers may aim to use observational data to draw causal inference about the effect of a treatment on a time-to-event outcome. In this context, interest often focuses on the treatment-specific survival curves, that is, the survival curves were the population under study to be assigned to receive the treatment or not. Under certain conditions, including that all confounders of the treatment-outcome relationship are observed, the treatment-specific survival curve can be identified with a covariate-adjusted survival curve. In this article, we propose a novel cross-fitted doubly-robust estimator that incorporates data-adaptive (e.g. machine learning) estimators of the conditional survival functions. We establish conditions on the nuisance estimators under which our estimator is consistent and asymptotically linear, both pointwise and uniformly in time. We also propose a novel ensemble learner for combining multiple candidate estimators of the conditional survival estimators. Notably, our methods and results accommodate events occurring in discrete or continuous time, or an arbitrary mix of the two. We investigate the practical performance of our methods using numerical studies and an application to the effect of a surgical treatment to prevent metastases of parotid carcinoma on mortality.
Free, publicly-accessible full text available April 2, 2025
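As a rough illustration of the covariate-adjusted survival curve mentioned in the abstract above, the sketch below averages stratum-specific Kaplan-Meier curves over the marginal distribution of a single discrete confounder. It is a plain plug-in (G-computation) estimate, not the cross-fitted doubly-robust estimator the abstract proposes; the function names, the tuple layout, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of P(T > t) from right-censored data.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    """
    surv = 1.0
    # Distinct observed event times up to and including t.
    event_times = sorted({u for u, d in zip(times, events) if d == 1 and u <= t})
    for s in event_times:
        at_risk = sum(1 for u in times if u >= s)
        deaths = sum(1 for u, d in zip(times, events) if u == s and d == 1)
        surv *= 1.0 - deaths / at_risk
    return surv


def adjusted_survival(rows, a, t):
    """Plug-in estimate of E[ S(t | A = a, W) ], averaging over the marginal law of W.

    rows : list of (time, event, treatment, covariate) tuples (hypothetical layout),
           with a discrete covariate W.
    """
    n = len(rows)
    w_counts = Counter(w for _, _, _, w in rows)
    total = 0.0
    for w, count in w_counts.items():
        stratum = [(u, d) for u, d, trt, cov in rows if trt == a and cov == w]
        if not stratum:
            # Positivity fails: nobody with W = w received treatment a.
            raise ValueError(f"no subjects with treatment {a} and covariate {w}")
        times, events = zip(*stratum)
        total += (count / n) * kaplan_meier(times, events, t)
    return total


# Toy data (made up): (time, event, treatment, covariate)
toy_rows = [
    (2, 1, 1, 0), (4, 1, 1, 0), (1, 1, 0, 0), (3, 0, 0, 0),
    (5, 0, 1, 1), (6, 1, 1, 1), (2, 1, 0, 1), (2, 1, 0, 1),
]
treated_curve_at_3 = adjusted_survival(toy_rows, a=1, t=3)
control_curve_at_3 = adjusted_survival(toy_rows, a=0, t=3)
```

Stratifying on the confounder only works when it takes a handful of values; with many or continuous confounders the conditional survival functions would have to be modeled, which is where data-adaptive estimators come in.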
Abstract We propose a broad class of so-called Cox–Aalen transformation models that incorporate both multiplicative and additive covariate effects on the baseline hazard function within a transformation. The proposed models provide a highly flexible and versatile class of semiparametric models that include the transformation models and the Cox–Aalen model as special cases. Specifically, it extends the transformation models by allowing potentially time-dependent covariates to work additively on the baseline hazard and extends the Cox–Aalen model through a predetermined transformation function. We propose an estimating equation approach and devise an expectation-solving (ES) algorithm that involves fast and robust calculations. The resulting estimator is shown to be consistent and asymptotically normal via modern empirical process techniques. The ES algorithm yields a computationally simple method for estimating the variance of both parametric and nonparametric estimators. Finally, we demonstrate the performance of our procedures through extensive simulation studies and applications in two randomized, placebo-controlled human immunodeficiency virus (HIV) prevention efficacy trials. The data example shows the utility of the proposed Cox–Aalen transformation models in enhancing statistical power for discovering covariate effects.
In the CYD14 trial of the CYD-TDV dengue vaccine in 2–14 year-olds, neutralizing antibody (nAb) titers to the vaccine-insert dengue strains correlated inversely with symptomatic, virologically-confirmed dengue (VCD). Also, vaccine efficacy against VCD was higher against dengue prM/E amino acid sequences closer to the vaccine inserts. We integrated the nAb and sequence data types by assessing nAb titers as a correlate of sequence-specific VCD separately in the vaccine arm and in the placebo arm. In both vaccine and placebo recipients the correlation of nAb titer with sequence-specific VCD was stronger for dengue nAb contact site sequences closer to the vaccine (p = 0.005 and p = 0.012, respectively). The risk of VCD in vaccine (placebo) recipients was 6.7- (1.80)-fold lower at the 90th vs 10th percentile of nAb for viruses perfectly matched to CYD-TDV, compared to 2.1- (0.78)-fold lower at the 90th vs 10th percentile for viruses with five amino acid mismatches. The evidence for a stronger sequence-distance dependent correlate of risk for the vaccine arm indicates departure from the Prentice criteria for a valid sequence-distance specific surrogate endpoint and suggests that the nAb marker may affect dengue risk differently depending on whether nAbs arise from infection or also by vaccination. However, when restricting to baseline-seropositive 9–14 year-olds, the correlation pattern became more similar between the vaccine and placebo arms, supporting nAb titers as an approximate surrogate endpoint in this population. No sequence-specific nAb titer correlates of VCD were seen in baseline-seronegative participants. Integrated immune response/pathogen sequence data correlates analyses could help increase knowledge of correlates of risk and surrogate endpoints for other vaccines against genetically diverse pathogens.
Abstract Statistical analysis of longitudinal data often involves modeling treatment effects on clinically relevant longitudinal biomarkers since an initial event (the time origin). In some studies including preventive HIV vaccine efficacy trials, some participants have biomarkers measured starting at the time origin, whereas others have biomarkers measured starting later with the time origin unknown. The semiparametric additive time-varying coefficient model is investigated where the effects of some covariates vary nonparametrically with time while the effects of others remain constant. Weighted profile least squares estimators coupled with kernel smoothing are developed. The method uses the expectation maximization approach to deal with the censored time origin. The Kaplan–Meier estimator and other failure time regression models such as the Cox model can be utilized to estimate the distribution and the conditional distribution of left censored event time related to the censored time origin. Asymptotic properties of the parametric and nonparametric estimators and consistent asymptotic variance estimators are derived. A two-stage estimation procedure for choosing weight is proposed to improve estimation efficiency. Numerical simulations are conducted to examine finite sample properties of the proposed estimators. The simulation results show that the theory and methods work well. The efficiency gain of the two-stage estimation procedure depends on the distribution of the longitudinal error processes. The method is applied to analyze data from the Merck 023/HVTN 502 Step HIV vaccine study.
This paper studies the Cox model with time-varying coefficients for cause-specific hazard functions when the causes of failure are subject to missingness. Inverse probability weighted and augmented inverse probability weighted estimators are investigated. The latter is considered as a two-stage estimator by directly utilizing the inverse probability weighted estimator and through modeling available auxiliary variables to improve efficiency. The asymptotic properties of the two estimators are investigated. Hypothesis testing procedures are developed to test the null hypotheses that the covariate effects are zero and that the covariate effects are constant. We conduct simulation studies to examine the finite sample properties of the proposed estimation and hypothesis testing procedures under various settings of the auxiliary variables and the percentages of the failure causes that are missing. These simulation results demonstrate that the augmented inverse probability weighted estimators are more efficient than the inverse probability weighted estimators and that the proposed testing procedures have the expected satisfactory results in sizes and powers. The proposed methods are illustrated using the Mashi clinical trial data for investigating the effect of randomization to formula-feeding versus breastfeeding plus extended infant zidovudine prophylaxis on death due to mother-to-child HIV transmission in Botswana.
Abstract This article presents generalized semiparametric regression models for conditional cumulative incidence functions with competing risks data when covariates are missing by sampling design or happenstance. A doubly robust augmented inverse probability weighted (AIPW) complete‐case approach to estimation and inference is investigated. This approach modifies IPW complete‐case estimating equations by exploiting the key features in the relationship between the missing covariates and the phase‐one data to improve efficiency. An iterative numerical procedure is derived to solve the nonlinear estimating equations. The asymptotic properties of the proposed estimators are established. A simulation study examining the finite‐sample performances of the proposed estimators shows that the AIPW estimators are more efficient than the IPW estimators. The developed method is applied to the RV144 HIV‐1 vaccine efficacy trial to investigate vaccine‐induced IgG binding antibodies to HIV‐1 as correlates of acquisition of HIV‐1 infection while taking account of whether the HIV‐1 sequences are near or far from the HIV‐1 sequences represented in the vaccine construct.
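To make the IPW versus AIPW distinction in the abstract above concrete, here is a minimal sketch for the simplest possible target: the mean of a variable that is observed only for some subjects. It is not the estimating-equation procedure of the paper. The names `ipw_mean` and `aipw_mean`, the assumption that the selection probabilities `pi` are known by design, and the working predictions `m` (built from always-observed data) are all hypothetical choices for illustration.

```python
def ipw_mean(y, r, pi):
    """Inverse probability weighted complete-case estimate of E[Y].

    y  : values, with None where the value is missing
    r  : 1 if y is observed, 0 otherwise
    pi : probability that y is observed for each subject (assumed known)
    """
    n = len(r)
    return sum(yi / p for yi, ri, p in zip(y, r, pi) if ri) / n


def aipw_mean(y, r, pi, m):
    """Augmented IPW estimate of E[Y].

    m : working prediction of y for every subject, built from data that are
        always observed. Each subject contributes m_i, and observed subjects
        add the weighted residual (y_i - m_i) / pi_i.
    """
    n = len(r)
    total = 0.0
    for yi, ri, p, mi in zip(y, r, pi, m):
        total += mi
        if ri:
            total += (yi - mi) / p
    return total / n


# Toy data (made up): the second value is missing.
y_toy = [1.0, None, 3.0, 5.0]
r_toy = [1, 0, 1, 1]
pi_toy = [0.5, 0.5, 1.0, 1.0]
m_toy = [1.0, 1.0, 4.0, 4.0]

ipw_estimate = ipw_mean(y_toy, r_toy, pi_toy)
aipw_estimate = aipw_mean(y_toy, r_toy, pi_toy, m_toy)
```

Setting every entry of `m` to zero turns `aipw_mean` back into `ipw_mean`, which shows the augmentation term is the only difference between the two. The efficiency gain reported for AIPW comes from how well the working predictions track the missing values.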
Failure time data subject to various types of censoring commonly arise in epidemiological and biomedical studies. Motivated by an AIDS clinical trial, we consider regression analysis of failure time data that include exact and left‐, interval‐, and/or right‐censored observations, which are often referred to as partly interval‐censored failure time data. We study the effects of potentially time‐dependent covariates on partly interval‐censored failure time via a class of semiparametric transformation models that includes the widely used proportional hazards model and the proportional odds model as special cases. We propose an EM algorithm for the nonparametric maximum likelihood estimation and show that it unifies some existing approaches developed for traditional right‐censored data or purely interval‐censored data. In particular, the proposed method reduces to the partial likelihood approach in the case of right‐censored data under the proportional hazards model. We establish that the resulting estimator is consistent and asymptotically normal. In addition, we investigate the proposed method via simulation studies and apply it to the motivating AIDS clinical trial.
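The abstract above notes that, for right-censored data under the proportional hazards model, the proposed method reduces to the partial likelihood approach. For readers who want that special case spelled out, the sketch below evaluates the standard Cox partial log-likelihood for a single time-fixed covariate. It is only this textbook special case, not the EM algorithm for partly interval-censored data; the function name and the toy data are assumptions.

```python
import math


def cox_partial_loglik(beta, times, events, x):
    """Cox partial log-likelihood for one covariate with right-censored data.

    Each observed event at time t_i contributes
        beta * x_i - log( sum over subjects still at risk at t_i of exp(beta * x_j) ).
    Written for data without tied event times; with ties, this expression
    corresponds to the Breslow approximation.
    """
    loglik = 0.0
    for t_i, d_i, x_i in zip(times, events, x):
        if d_i != 1:
            continue  # censored subjects enter only through the risk sets
        risk_sum = sum(math.exp(beta * x_j) for t_j, x_j in zip(times, x) if t_j >= t_i)
        loglik += beta * x_i - math.log(risk_sum)
    return loglik


# Toy data (made up): three subjects, the second one censored.
toy_times = [1.0, 2.0, 3.0]
toy_events = [1, 0, 1]
toy_x = [1.0, 0.0, 0.0]

loglik_at_zero = cox_partial_loglik(0.0, toy_times, toy_events, toy_x)
loglik_at_one = cox_partial_loglik(1.0, toy_times, toy_events, toy_x)
```

In practice the regression coefficient is estimated by maximizing this function over `beta`, for example with a Newton-type iteration.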
Summary This article investigates a generalized semiparametric varying-coefficient model for longitudinal data that can flexibly model three types of covariate effects: time-constant effects, time-varying effects, and covariate-varying effects. Different link functions can be selected to provide a rich family of models for longitudinal data. The model assumes that the time-varying effects are unspecified functions of time and the covariate-varying effects are parametric functions of an exposure variable specified up to a finite number of unknown parameters. The estimation procedure is developed using local linear smoothing and profile weighted least squares estimation techniques. Hypothesis testing procedures are developed to test the parametric functions of the covariate-varying effects. The asymptotic distributions of the proposed estimators are established. A working formula for bandwidth selection is discussed and examined through simulations. Our simulation study shows that the proposed methods have satisfactory finite sample performance. The proposed methods are applied to the ACTG 244 clinical trial of HIV infected patients being treated with Zidovudine to examine the effects of antiretroviral treatment switching before and after HIV develops the T215Y/F drug resistance mutation. Our analysis shows benefits of treatment switching to the combination therapies as compared to continuing with ZDV monotherapy before and after developing the 215-mutation.