NSF PAR Search | NSF Public Access Repository

Improving Trial Generalizability Using Observational Studies

https://doi.org/10.1111/biom.13609

Lee, Dasom; Yang, Shu; Dong, Lin; Wang, Xiaofei; Zeng, Donglin; Cai, Jianwen (December 2021, Biometrics)

Abstract Complementary features of randomized controlled trials (RCTs) and observational studies (OSs) can be used jointly to estimate the average treatment effect of a target population. We propose a calibration weighting estimator that enforces the covariate balance between the RCT and OS, thereby improving the trial-based estimator's generalizability. Exploiting semiparametric efficiency theory, we propose a doubly robust augmented calibration weighting estimator that achieves the efficiency bound derived under the identification assumptions. A nonparametric sieve method is provided as an alternative to the parametric approach, which enables the robust approximation of the nuisance functions and data-adaptive selection of outcome predictors for calibration. We establish asymptotic results and confirm the finite-sample performance of the proposed estimators through simulation experiments and an application to estimating the treatment effect of adjuvant chemotherapy for early-stage non-small-cell lung cancer patients after surgery.
SMIM: A unified framework of survival sensitivity analysis using multiple imputation and martingale

https://doi.org/10.1111/biom.13555

Yang, Shu; Zhang, Yilong; Liu, Guanghan Frank; Guan, Qian (September 2021, Biometrics)

Abstract Censored survival data are common in clinical trial studies. We propose a unified framework for sensitivity analysis to censoring at random in survival data using multiple imputation and martingale, called SMIM. The proposed framework adopts the δ‐adjusted and control‐based models, indexed by the sensitivity parameter, entailing censoring at random and a wide collection of censoring not at random assumptions. Also, it targets a broad class of treatment effect estimands defined as functionals of treatment‐specific survival functions, taking into account missing data due to censoring. Multiple imputation facilitates the use of simple full‐sample estimation; however, the standard Rubin's combining rule may overestimate the variance for inference in the sensitivity analysis framework. We decompose the multiple imputation estimator into a martingale series based on the sequential construction of the estimator and propose the wild bootstrap inference by resampling the martingale series. The new bootstrap inference has a theoretical guarantee for consistency and is computationally efficient compared to the nonparametric bootstrap counterpart. We evaluate the finite‐sample performance of the proposed SMIM through simulation and an application on an HIV clinical trial.
Nonparametric Mass Imputation for Data Integration

https://doi.org/10.1093/jssam/smaa036

Chen, Sixia; Yang, Shu; Kim, Jae Kwang (November 2020, Journal of Survey Statistics and Methodology)

Abstract Data integration combining a probability sample with another nonprobability sample is an emerging area of research in survey sampling. We consider the case when the study variable of interest is measured only in the nonprobability sample, but comparable auxiliary information is available for both data sources. We consider mass imputation for the probability sample using the nonprobability data as the training set for imputation. The parametric mass imputation is sensitive to parametric model assumptions. To develop improved and robust methods, we consider nonparametric mass imputation for data integration. In particular, we consider kernel smoothing for a low-dimensional covariate and generalized additive models for a relatively high-dimensional covariate for imputation. Asymptotic theories and variance estimation are developed. Simulation studies and real applications show the benefits of our proposed methods over parametric counterparts.
Semiparametric estimation of structural nested mean models with irregularly spaced longitudinal observations

https://doi.org/10.1111/biom.13471

Yang, Shu (April 2021, Biometrics)

Abstract Structural nested mean models (SNMMs) are useful for causal inference of treatment effects in longitudinal observational studies. Most existing works assume that the data are collected at prefixed time points for all subjects, which, however, may be restrictive in practice. To deal with irregularly spaced observations, we assume a class of continuous‐time SNMMs and a martingale condition of no unmeasured confounding (NUC) to identify the causal parameters. We develop the semiparametric efficiency theory and locally efficient estimators for continuous‐time SNMMs. This task is nontrivial due to the restrictions from the NUC assumption imposed on the SNMM parameter. In the presence of ignorable censoring, we show that the complete‐case estimator is optimal among a class of weighting estimators including the inverse probability of censoring weighting estimator, and it achieves a double robustness feature in that it is consistent if at least one of the models for the potential outcome mean function and the treatment process is correctly specified. The new framework allows us to conduct causal analysis respecting the underlying continuous‐time nature of data processes. The simulation study shows that the proposed estimator outperforms existing approaches. We estimate the effect of time to initiate highly active antiretroviral therapy on the CD4 count at year 2 from the observational Acute Infection and Early Disease Research Program database.
Robust Inference of Conditional Average Treatment Effects Using Dimension Reduction

https://doi.org/10.5705/ss.202020.0409

Huang, Ming-Yueh; Yang, Shu (January 2023, Statistica Sinica)
Full Text Available
Practical recommendations on double score matching for estimating causal effects

https://doi.org/10.1002/sim.9289

Zhang, Yunshu; Yang, Shu; Ye, Wenyu; Faries, Douglas E.; Lipkovich, Ilya; Kadziola, Zbigniew (April 2022, Statistics in Medicine)

Full Text Available
Nearest neighbour ratio imputation with incomplete multinomial outcome in survey sampling

https://doi.org/10.1111/rssa.12841

Gao, Chenyin; Thompson, Katherine Jenny; Kim, Jae Kwang; Yang, Shu (January 2022, Journal of the Royal Statistical Society: Series A (Statistics in Society))

Full Text Available
Multiply robust matching estimators of average and quantile treatment effects

https://doi.org/10.1111/sjos.12585

Yang, Shu; Zhang, Yunshu (January 2022, Scandinavian Journal of Statistics)

Full Text Available
Discussion on “Spatial+: A novel approach to spatial confounding” by Dupont, Wood, and Augustin

https://doi.org/10.1111/biom.13651

Reich, Brian J.; Yang, Shu; Guan, Yawen (January 2022, Biometrics)

Full Text Available
Propensity Score Modeling in Electronic Health Records with Time-to-Event Endpoints: Application to Kidney Transplantation

https://doi.org/10.6339/22-JDS1046

Yu, Jonathan W.; Bandyopadhyay, Dipankar; Yang, Shu; Kang, Le; Gupta, Gaurav (January 2022, Journal of Data Science)

For large observational studies lacking a control group (unlike randomized controlled trials, RCTs), propensity scores (PS) are often the method of choice to account for pre-treatment confounding in baseline characteristics, and thereby avoid substantial bias in treatment estimation. The vast majority of PS techniques focus on average treatment effect estimation, without any clear consensus on how to account for confounders, especially in a multiple-treatment setting. Furthermore, for time-to-event outcomes, the analytical framework is further complicated in the presence of high censoring rates (sometimes due to non-susceptibility of study units to a disease), imbalance between treatment groups, and the clustered nature of the data (where survival outcomes appear in groups). Motivated by a right-censored kidney transplantation dataset derived from the United Network for Organ Sharing (UNOS), we investigate and compare two recent promising PS procedures, (a) the generalized boosted model (GBM), and (b) the covariate-balancing propensity score (CBPS), in an attempt to decouple the causal effects of treatments (here, study subgroups, such as hepatitis C virus (HCV) positive/negative donors, and positive/negative recipients) on time to death of kidney recipients due to kidney failure, post transplantation. For estimation, we employ a 2-step procedure which addresses various complexities observed in the UNOS database within a unified paradigm. First, to adjust for the large number of confounders on the multiple sub-groups, we fit multinomial PS models via procedures (a) and (b). In the next stage, the estimated PS is incorporated into the likelihood of a semi-parametric cure rate Cox proportional hazard frailty model via inverse probability of treatment weighting, adjusted for multi-center clustering and excess censoring. Our data analysis reveals a more informative and superior performance of the full model in terms of treatment effect estimation, over sub-models that relax the various features of the event time dataset.
Full Text Available
