Title: High‐dimensional single‐index models with censored responses

In this article, we study the estimation of high‐dimensional single‐index models when the response variable is censored. We combine estimation methods for high‐dimensional single‐index models (without censoring) with methods for univariate nonparametric models with randomly censored responses to estimate the index parameters and the link function. We establish large sample theory for the proposed estimators of the index parameter and the nonparametric link function under certain regularity conditions, evaluate the finite sample performance of the proposed procedures via simulation studies, and apply the proposed methods to analyze a genomic dataset from a study of diffuse large B‐cell lymphoma.
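For orientation, the setting described in this abstract can be written in a standard form. The notation below (link function g, index parameter β, censoring variable C, indicator δ) is assumed here for illustration and is not taken from the article itself; the article's exact model and assumptions may differ.

```latex
\begin{align*}
  Y_i &= g\!\left(X_i^{\top}\beta\right) + \varepsilon_i,
      & X_i,\ \beta &\in \mathbb{R}^{p} \quad (p \text{ possibly larger than } n), \\
  Z_i &= \min(Y_i, C_i),
      & \delta_i &= \mathbf{1}\{Y_i \le C_i\}, \qquad i = 1, \dots, n.
\end{align*}
```

Under this sketch only the triples (Z_i, δ_i, X_i) are observed, and the goal is to estimate both the index parameter β and the unknown link function g.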

 
PAR ID:
10376646
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Statistics in Medicine
Volume:
39
Issue:
21
ISSN:
0277-6715
Page Range / eLocation ID:
p. 2743-2754
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. To better fit real data, this paper builds a model that accounts for both spatio-temporal correlation and heterogeneity. To overcome the “curse of dimensionality” in nonparametric methods, we improve the estimation method for the single-index model and combine it with the correlation and heterogeneity structure of the spatio-temporal model. Assuming that the spatio-temporal process satisfies the α-mixing condition, a nonparametric procedure is developed for estimating the variance function based on a fully nonparametric function or a dimension-reduction structure, and the resulting estimator is consistent. A reweighted estimator of the parametric component is then obtained by taking the estimated variance function into account. The rate of convergence and the asymptotic normality of the new estimators are established under mild conditions. Simulation studies are conducted to evaluate the efficacy of the proposed methodologies, and a case study on estimating the air quality evaluation index in Nanjing is provided for illustration.

     
  2. Data integration combining a probability sample with another nonprobability sample is an emerging area of research in survey sampling. We consider the case when the study variable of interest is measured only in the nonprobability sample, but comparable auxiliary information is available for both data sources. We consider mass imputation for the probability sample using the nonprobability data as the training set for imputation. The parametric mass imputation is sensitive to parametric model assumptions. To develop improved and robust methods, we consider nonparametric mass imputation for data integration. In particular, we consider kernel smoothing for a low-dimensional covariate and generalized additive models for a relatively high-dimensional covariate for imputation. Asymptotic theories and variance estimation are developed. Simulation studies and real applications show the benefits of our proposed methods over parametric counterparts.

     
  3. We propose a class of semiparametric functional regression models to describe the influence of vector-valued covariates on a sample of response curves. Each observed curve is viewed as the realization of a random process, composed of an overall mean function and random components. The finite dimensional covariates influence the random components of the eigenfunction expansion through single-index models that include unknown smooth link and variance functions. The parametric components of the single-index models are estimated via quasi-score estimating equations with link and variance functions being estimated nonparametrically. We obtain several basic asymptotic results. The functional regression models proposed are illustrated with the analysis of a data set consisting of egg laying curves for 1000 female Mediterranean fruit-flies (medflies).

     
  4. Statistical analysis of longitudinal data often involves modeling treatment effects on clinically relevant longitudinal biomarkers since an initial event (the time origin). In some studies including preventive HIV vaccine efficacy trials, some participants have biomarkers measured starting at the time origin, whereas others have biomarkers measured starting later with the time origin unknown. The semiparametric additive time-varying coefficient model is investigated where the effects of some covariates vary nonparametrically with time while the effects of others remain constant. Weighted profile least squares estimators coupled with kernel smoothing are developed. The method uses the expectation maximization approach to deal with the censored time origin. The Kaplan–Meier estimator and other failure time regression models such as the Cox model can be utilized to estimate the distribution and the conditional distribution of left censored event time related to the censored time origin. Asymptotic properties of the parametric and nonparametric estimators and consistent asymptotic variance estimators are derived. A two-stage estimation procedure for choosing weight is proposed to improve estimation efficiency. Numerical simulations are conducted to examine finite sample properties of the proposed estimators. The simulation results show that the theory and methods work well. The efficiency gain of the two-stage estimation procedure depends on the distribution of the longitudinal error processes. The method is applied to analyze data from the Merck 023/HVTN 502 Step HIV vaccine study.

     
  5. Failure time data subject to various types of censoring commonly arise in epidemiological and biomedical studies. Motivated by an AIDS clinical trial, we consider regression analysis of failure time data that include exact and left‐, interval‐, and/or right‐censored observations, which are often referred to as partly interval‐censored failure time data. We study the effects of potentially time‐dependent covariates on partly interval‐censored failure time via a class of semiparametric transformation models that includes the widely used proportional hazards model and the proportional odds model as special cases. We propose an EM algorithm for the nonparametric maximum likelihood estimation and show that it unifies some existing approaches developed for traditional right‐censored data or purely interval‐censored data. In particular, the proposed method reduces to the partial likelihood approach in the case of right‐censored data under the proportional hazards model. We establish that the resulting estimator is consistent and asymptotically normal. In addition, we investigate the proposed method via simulation studies and apply it to the motivating AIDS clinical trial.

     
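Several of the entries above deal with censored event times, and entry 4 names the Kaplan–Meier estimator. As a minimal illustration, the sketch below computes the textbook Kaplan–Meier survival estimate for right-censored data. The function name `kaplan_meier` and the toy data are made up for this example and are not taken from any of the listed papers, which use the estimator in more involved settings (for instance, left-censored time origins in entry 4).

```python
import numpy as np


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate for right-censored data.

    times  : observed times, i.e. min(event time, censoring time)
    events : 1 if the event was observed, 0 if the observation was censored
    Returns the distinct observed event times and the estimated survival
    probability just after each of them.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)

    # The estimate only changes at times where an event was actually observed.
    event_times = np.unique(times[events == 1])

    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)                  # still under observation at t
        n_events = np.sum((times == t) & (events == 1))
        s *= 1.0 - n_events / at_risk                 # product-limit update
        survival.append(s)
    return event_times, np.array(survival)


# Hypothetical toy data: six subjects, two of them censored (events == 0).
toy_times = [2, 3, 3, 5, 7, 8]
toy_events = [1, 1, 0, 1, 0, 1]
km_times, km_survival = kaplan_meier(toy_times, toy_events)
```

Censored subjects never trigger a drop in the curve, but they do count in the risk set up to their censoring time, which is how the estimator uses the partial information they carry.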