

Search for: All records

Award ID contains: 1812048


  1. Abstract

    The flexibility and wide applicability of the Fisher randomization test (FRT) make it an attractive tool for assessing causal effects of interventions in modern-day randomized experiments, which are increasing in size and complexity. This paper provides a theoretical inferential framework for the FRT by establishing its connection with confidence distributions. This connection leads to the development of (i) an unambiguous procedure for inverting FRTs to generate confidence intervals with guaranteed coverage, (ii) new insights on the effect of the Monte Carlo sample size on the estimation of a p-value curve, and (iii) generic and specific methods for combining FRTs from multiple independent experiments with theoretical guarantees. Our developments pertain to finite-sample settings but extend directly to large samples. Simulations and a case example demonstrate the benefit of these new developments.
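    The FRT machinery the abstract builds on — a Monte Carlo p-value under the sharp null of no effect — can be sketched as follows. This is a minimal stand-alone illustration, not the paper's procedure; the difference in means is one arbitrary choice of test statistic.

    ```python
    import random

    def frt_p_value(treated, control, n_draws=2000, seed=0):
        """Monte Carlo Fisher randomization test of the sharp null of no
        effect, using the absolute difference in means as the statistic."""
        rng = random.Random(seed)
        pooled = list(treated) + list(control)
        n_t = len(treated)
        obs = abs(sum(treated) / n_t - sum(control) / len(control))
        exceed = 0
        for _ in range(n_draws):
            rng.shuffle(pooled)  # re-randomize the treatment assignment
            t, c = pooled[:n_t], pooled[n_t:]
            if abs(sum(t) / n_t - sum(c) / len(c)) >= obs:
                exceed += 1
        # Count the observed assignment itself so the p-value is never zero.
        return (exceed + 1) / (n_draws + 1)

    print(frt_p_value([5, 6, 7, 8, 9], [0, 1, 2, 1, 0]))  # small: effect detected
    ```

    Inverting such a test — subtracting a hypothesized constant effect from the treated outcomes and collecting the values not rejected at level α — yields a confidence interval; the confidence-distribution framework described in the abstract is what makes that inversion unambiguous.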

  2. Summary

    Many clinical endpoint measures, such as the number of standard drinks consumed per week or the number of days that patients stayed in the hospital, are count data with excessive zeros. However, the zero-inflated nature of such outcomes is sometimes ignored in analyses of clinical trials. This leads to biased estimates of study-level intervention effects and, consequently, a biased estimate of the overall intervention effect in a meta-analysis. The current study proposes a novel statistical approach, the Zero-inflation Bias Correction (ZIBC) method, that accounts for the bias introduced when the Poisson regression model is used despite a high rate of inflated zeros in the outcome distribution of a randomized clinical trial. The correction requires only summary information from individual studies to correct intervention effect estimates as if they had been appropriately estimated using the zero-inflated Poisson regression model, making it attractive for meta-analysis when individual participant-level data are not available for some studies. Simulation studies and real data analyses showed that the ZIBC method performs well in correcting zero-inflation bias in most situations.
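    A toy simulation (standard library only; not the ZIBC procedure itself, which corrects regression coefficient estimates) shows the underlying bias: a naive Poisson fit targets (1 − π)λ rather than λ, while a method-of-moments fix using only two summaries — the sample mean and the fraction of zeros — recovers both parameters.

    ```python
    import math
    import random

    def simulate_zip(n, lam, pi_zero, seed=1):
        """Zero-inflated Poisson draws: a structural zero with probability
        pi_zero, otherwise a Poisson(lam) count (Knuth's algorithm)."""
        rng = random.Random(seed)
        def poisson(l):
            limit, k, p = math.exp(-l), 0, 1.0
            while True:
                p *= rng.random()
                if p <= limit:
                    return k
                k += 1
        return [0 if rng.random() < pi_zero else poisson(lam) for _ in range(n)]

    def zip_moment_fit(mean, zero_frac):
        """Recover (lam, pi) from two summaries by solving
        mean = (1 - pi) * lam  and  zero_frac = pi + (1 - pi) * exp(-lam)
        via bisection in lam (lam must exceed the mean when pi > 0)."""
        def gap(lam):
            pi = 1.0 - mean / lam
            return pi + (1.0 - pi) * math.exp(-lam) - zero_frac
        lo, hi = mean * (1 + 1e-9), 50.0  # gap is increasing in lam
        for _ in range(200):
            mid = (lo + hi) / 2.0
            if gap(mid) > 0.0:
                hi = mid
            else:
                lo = mid
        lam = (lo + hi) / 2.0
        return lam, 1.0 - mean / lam

    counts = simulate_zip(20000, lam=3.0, pi_zero=0.4)
    naive = sum(counts) / len(counts)  # Poisson MLE: biased toward (1 - 0.4) * 3 = 1.8
    lam_hat, pi_hat = zip_moment_fit(naive, counts.count(0) / len(counts))
    ```

    The fit uses exactly the kind of study-level summary information the abstract mentions — no participant-level data are needed once the mean and zero fraction are reported.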

  3. Abstract

    Multivariate failure time data are frequently analyzed using marginal proportional hazards models and frailty models. When the sample size is extraordinarily large, either approach can face computational challenges. In this paper, we focus on the marginal model approach and propose a divide-and-combine method for analyzing large-scale multivariate failure time data. Our method is motivated by the Myocardial Infarction Data Acquisition System (MIDAS), a New Jersey statewide database that includes 73,725,160 admissions to nonfederal hospitals and emergency rooms (ERs) from 1995 to 2017. We randomly divide the full data into multiple subsets and combine the estimators obtained from the individual subsets using a weighted method with three choices of weights. Under mild conditions, we show that the combined estimator is asymptotically equivalent to the estimator obtained from the full data as if the data had been analyzed all at once. In addition, to screen out risk factors with weak signals, we propose performing regularized estimation on the combined estimator using its combined confidence distribution. Theoretical properties, such as consistency, oracle properties, and the asymptotic equivalence between the divide-and-combine approach and the full-data approach, are studied. The performance of the proposed method is investigated using simulation studies. Our method is applied to the MIDAS data to identify risk factors related to multivariate cardiovascular-related health outcomes.
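    The divide-and-combine idea can be illustrated on the simplest possible estimator — a mean combined across random subsets with inverse-variance weights. This is a hypothetical simplification: the paper combines marginal Cox-model coefficient estimates and studies three weighting choices, but the partition-estimate-reweight structure is the same.

    ```python
    import random
    import statistics

    def divide_and_combine(data, n_splits=8, seed=0):
        """Randomly partition the data, estimate the mean and its variance
        on each subset, then combine the subset estimates with
        inverse-variance weights."""
        rng = random.Random(seed)
        shuffled = data[:]
        rng.shuffle(shuffled)
        size = len(shuffled) // n_splits
        estimates, weights = [], []
        for k in range(n_splits):
            subset = shuffled[k * size:(k + 1) * size]
            estimates.append(statistics.fmean(subset))
            # Inverse of the estimated variance of this subset's mean.
            weights.append(len(subset) / statistics.variance(subset))
        total = sum(weights)
        return sum(w * e for w, e in zip(estimates, weights)) / total

    rng = random.Random(42)
    data = [rng.gauss(2.0, 1.0) for _ in range(4000)]
    combined = divide_and_combine(data)
    full = statistics.fmean(data)  # the all-at-once estimator
    ```

    Because each subset is only a fraction of the full data, the per-subset fits are cheap (and parallelizable), while the combined estimate tracks the full-data estimate closely — the finite-sample analogue of the asymptotic equivalence result stated above.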

  4. Abstract

    Fusion learning methods, developed for the purpose of analyzing datasets from many different sources, have become a popular research topic in recent years. Individualized inference approaches through fusion learning extend fusion learning to individualized inference problems over a heterogeneous population, where similar individuals are fused together to enhance inference about the target individual. Both classical fusion learning and individualized inference through fusion learning are based on weighted aggregation of individual information, but the weights used in the latter are localized to the target individual. This article reviews two individualized inference methods through fusion learning, iFusion and iGroup, that are developed under different asymptotic settings. Both procedures guarantee optimal asymptotic theoretical performance and computational scalability.

    This article is categorized under:

    Statistical Learning and Exploratory Methods of the Data Sciences > Manifold Learning

    Statistical Learning and Exploratory Methods of the Data Sciences > Modeling Methods

    Statistical and Graphical Methods of Data Analysis > Nonparametric Methods

    Data: Types and Structure > Massive Data
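    The localized weighting that distinguishes individualized inference from classical fusion learning can be sketched with a simple kernel weight. This is a deliberately simplified stand-in: iFusion's actual weights come from a data-adaptive, confidence-distribution-based construction, not a fixed Gaussian kernel.

    ```python
    import math

    def localized_fusion(estimates, target_idx, bandwidth=0.5):
        """Fuse individual-level estimates for one target individual.
        Weights decay with distance from the target's own estimate, so
        similar individuals dominate and dissimilar ones are down-weighted."""
        center = estimates[target_idx]
        weights = [math.exp(-(e - center) ** 2 / (2 * bandwidth ** 2))
                   for e in estimates]
        total = sum(weights)
        return sum(w * e for w, e in zip(estimates, weights)) / total

    # Three similar individuals and one outlier: the fused estimate for
    # individual 0 stays near 1.0, whereas an unlocalized global average
    # of the same four estimates would be pulled to 2.0.
    fused = localized_fusion([1.0, 1.1, 0.9, 5.0], target_idx=0)
    ```

    Classical fusion learning would weight all four sources by precision alone; localizing the weights to the target individual is what lets the procedure borrow strength only from comparable individuals in a heterogeneous population.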
