Title: Causal inference for multiple treatments using fractional factorial designs
Abstract

We consider the design and analysis of multi‐factor experiments using fractional factorial and incomplete designs within the potential outcome framework. These designs are particularly useful when limited resources make running a full factorial design infeasible. We connect our design‐based methods to standard regression methods. We further motivate the usefulness of these designs in multi‐factor observational studies, where certain treatment combinations may be so rare that there are no measured outcomes in the observed data corresponding to them. Therefore, conceptualizing a hypothetical fractional factorial experiment instead of a full factorial experiment allows for appropriate analysis in those settings. We illustrate our approach using biomedical data from the 2003–2004 cycle of the National Health and Nutrition Examination Survey to examine the effects of four common pesticides on body mass index.
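As a rough illustration of what a fractional factorial design looks like with four two-level factors (such as the four pesticide exposures mentioned above), the sketch below builds a half fraction, a 2^(4-1) design, in which the level of the fourth factor is set by the product of the other three. The generator D = ABC, the factor labels A through D, and the helper names are assumptions chosen for illustration; the abstract does not say which fraction the authors use. The sketch also checks two standard properties of this fraction: the main-effect columns are mutually orthogonal, and the AB interaction column is identical to the CD column, so those two interactions cannot be separated.

```python
from itertools import product

def half_fraction_2_4():
    """Return the 8 runs of a 2^(4-1) design using the assumed generator D = A*B*C."""
    runs = []
    for a, b, c in product((-1, 1), repeat=3):
        # The fourth factor's level is determined by the other three.
        runs.append((a, b, c, a * b * c))
    return runs

def column(runs, idx):
    """Extract one factor's column of -1/+1 levels."""
    return [r[idx] for r in runs]

def dot(u, v):
    """Inner product of two columns; zero means the contrasts are orthogonal."""
    return sum(x * y for x, y in zip(u, v))

runs = half_fraction_2_4()
A, B, C, D = (column(runs, i) for i in range(4))

# Two-factor interaction contrasts are elementwise products of factor columns.
AB = [a * b for a, b in zip(A, B)]
CD = [c * d for c, d in zip(C, D)]

# Every pair of main-effect columns should be orthogonal.
factors = (A, B, C, D)
main_effects_orthogonal = all(
    dot(factors[i], factors[j]) == 0
    for i in range(4)
    for j in range(i + 1, 4)
)

# Under D = ABC, the AB and CD contrasts coincide (they are aliased).
ab_aliased_with_cd = AB == CD

for run in runs:
    print(run)
print("main effects orthogonal:", main_effects_orthogonal)
print("AB aliased with CD:", ab_aliased_with_cd)
```

Only 8 of the 16 possible treatment combinations appear in this fraction, which mirrors the situation the abstract describes: some combinations have no observed outcomes, and the price is that certain interactions are confounded with one another.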

 
PAR ID:
10411789
Author(s) / Creator(s):
 ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Canadian Journal of Statistics
Volume:
51
Issue:
2
ISSN:
0319-5724
Page Range / eLocation ID:
p. 444-468
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Discrete choice experiments (DCEs) are increasingly used for studying and quantifying subjects' preferences in a wide variety of healthcare applications. They provide a rich source of data to assess real‐life decision‐making processes, which involve trade‐offs between desirable characteristics pertaining to health and healthcare and identification of key attributes affecting healthcare. The choice of the design for a DCE is critical because it determines which attributes' effects and their interactions are identifiable. We apply blocked fractional factorial designs to construct DCEs and address some identification issues by utilizing the known structure of blocked fractional factorial designs. Our design techniques can be applied to several situations, including DCEs where attributes have different numbers of levels. We demonstrate our design methodology using two healthcare studies to evaluate (i) asthma patients' preferences for symptom‐based outcome measures and (ii) patient preferences for breast screening services. Copyright © 2016 John Wiley & Sons, Ltd.

     
  2. For decades, multiple-driver/stressor research has examined interactions among drivers that will undergo large changes in the future: temperature, pH, nutrients, oxygen, pathogens, and more. However, the most commonly used experimental designs—present-versus-future and ANOVA—fail to contribute to general understanding or predictive power. Linking experimental design to process-based mathematical models would help us predict how ecosystems will behave in novel environmental conditions. We review a range of experimental designs and assess the best experimental path toward a predictive ecology. Full factorial response surface, fractional factorial, quadratic response surface, custom, space-filling, and especially optimal and sequential/adaptive designs can help us achieve more valuable scientific goals. Experiments using these designs are challenging to perform with long-lived organisms or at the community and ecosystem levels. But they remain our most promising path toward linking experiments and theory in multiple-driver research and making accurate, useful predictions.

     
  3. Summary Factorial designs are widely used because of their ability to accommodate multiple factors simultaneously. Factor-based regression with main effects and some interactions is the dominant strategy for downstream analysis, delivering point estimators and standard errors simultaneously via one least-squares fit. Justification of these convenient estimators from the design-based perspective requires quantifying their sampling properties under the assignment mechanism while conditioning on the potential outcomes. To this end, we derive the sampling properties of the regression estimators under a wide range of specifications, and establish the appropriateness of the corresponding robust standard errors for Wald-type inference. The results help to clarify the causal interpretation of the coefficients in these factor-based regressions, and motivate the definition of general factorial effects to unify the definitions of factorial effects in various fields. We also quantify the bias-variance trade-off between the saturated and unsaturated regressions from the design-based perspective. 
  4. Abstract

    The application of single-cell RNA sequencing (scRNAseq) for the evaluation of chemicals, drugs, and food contaminants presents the opportunity to consider cellular heterogeneity in pharmacological and toxicological responses. Current differential gene expression analysis (DGEA) methods focus primarily on two-group comparisons, not the multi-group dose–response study designs used in safety assessments. To benchmark DGEA methods for dose–response scRNAseq experiments, we propose a multiplicity-corrected Bayesian testing approach and compare it against 8 other methods, including two frequentist fit-for-purpose tests, using simulated and experimental data. Our Bayesian test method outperformed all other tests for a broad range of accuracy metrics, including control of false positive error rates. Most notably, the fit-for-purpose and standard multiple-group DGEA methods were superior to the two-group scRNAseq methods for dose–response study designs. Collectively, our benchmarking of DGEA methods demonstrates the importance of considering study design when determining the most appropriate test methods.

     
  5. Abstract

    Optimizing the allocation of units into treatment groups can help researchers improve the precision of causal estimators and decrease costs when running factorial experiments. However, existing optimal allocation results typically assume a super-population model and that the outcome data come from a known family of distributions. Instead, we focus on randomization-based causal inference for the finite-population setting, which does not require model specifications for the data or sampling assumptions. We propose exact theoretical solutions for optimal allocation in $2^{K}$ factorial experiments under complete randomization with A-, D-, and E-optimality criteria. We then extend this work to factorial designs with block randomization. We also derive results for optimal allocations when using cost-based constraints. To connect our theory to practice, we provide convenient integer-constrained programming solutions using a greedy optimization approach to find integer optimal allocation solutions for both complete and block randomizations. The proposed methods are demonstrated using two real-life factorial experiments conducted by social scientists.

     