Abstract Mendelian Randomization (MR) studies are threatened by population stratification, batch effects, and horizontal pleiotropy. Although a variety of methods have been proposed to mitigate those problems, residual biases may still remain, leading to highly statistically significant false positives in large databases. Here we describe a suite of sensitivity analysis tools that enables investigators to quantify the robustness of their findings against such validity threats. Specifically, we propose the routine reporting of sensitivity statistics that reveal the minimal strength of violations necessary to explain away the MR results. We further provide intuitive displays of the robustness of the MR estimate to any degree of violation, and formal bounds on the worst-case bias caused by violations multiple times stronger than observed variables. We demonstrate how these tools can aid researchers in distinguishing robust from fragile findings by examining the effect of body mass index on diastolic blood pressure and Townsend deprivation index.
more »
« less
Leveraging family data to design Mendelian Randomization that is provably robust to population stratification
Mendelian Randomization (MR) has emerged as a powerful approach to leverage genetic instruments to infer causality between pairs of traits in observational studies. However, the results of such studies are susceptible to biases due to weak instruments as well as the confounding effects of population stratification and horizontal pleiotropy. Here, we show that family data can be leveraged to design MR tests that are provably robust to confounding from population stratification, assortative mating, and dynastic effects. We demonstrate in simulations that our approach, MR-Twin, is robust to confounding from population stratification and is not affected by weak instrument bias, while standard MR methods yield inflated false positive rates. We then conducted an exploratory analysis of MR-Twin and other MR methods applied to 121 trait pairs in the UK Biobank dataset. Our results suggest that confounding from population stratification can lead to false positives for existing MR methods, while MR-Twin is immune to this type of confounding, and that MR-Twin can help assess whether traditional approaches may be inflated due to confounding from population stratification.
more »
« less
- PAR ID:
- 10423502
- Date Published:
- Journal Name:
- Genome Research
- ISSN:
- 1088-9051
- Page Range / eLocation ID:
- gr.277664.123
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract Mendelian randomization (MR) has been a popular method in genetic epidemiology to estimate the effect of an exposure on an outcome using genetic variants as instrumental variables (IV), with two‐sample summary‐data MR being the most popular. Unfortunately, instruments in MR studies are often weakly associated with the exposure, which can bias effect estimates and inflate Type I errors. In this work, we propose test statistics that are robust under weak‐instrument asymptotics by extending the Anderson–Rubin, Kleibergen, and the conditional likelihood ratio test in econometrics to two‐sample summary‐data MR. We also use the proposed Anderson–Rubin test to develop a point estimator and to detect invalid instruments. We conclude with a simulation and an empirical study and show that the proposed tests control size and have better power than existing methods with weak instruments.more » « less
-
Abstract Integrating results from genome-wide association studies (GWASs) and gene expression studies through transcriptome-wide association study (TWAS) has the potential to shed light on the causal molecular mechanisms underlying disease etiology. Here, we present a probabilistic Mendelian randomization (MR) method, PMR-Egger, for TWAS applications. PMR-Egger relies on a MR likelihood framework that unifies many existing TWAS and MR methods, accommodates multiple correlated instruments, tests the causal effect of gene on trait in the presence of horizontal pleiotropy, and is scalable to hundreds of thousands of individuals. In simulations, PMR-Egger provides calibrated type I error control for causal effect testing in the presence of horizontal pleiotropic effects, is reasonably robust under various types of model misspecifications, is more powerful than existing TWAS/MR approaches, and can directly test for horizontal pleiotropy. We illustrate the benefits of PMR-Egger in applications to 39 diseases and complex traits obtained from three GWASs including the UK Biobank.more » « less
-
Machine learning (ML) methods for causal inference have gained popularity due to their flexibility to predict the outcome model and the propensity score. In this article, we provide a within-group approach for ML-based causal inference methods in order to robustly estimate average treatment effects in multilevel studies when there is cluster-level unmeasured confounding. We focus on one particular ML-based causal inference method based on the targeted maximum likelihood estimation (TMLE) with an ensemble learner called SuperLearner. Through our simulation studies, we observe that training TMLE within groups of similar clusters helps remove bias from cluster-level unmeasured confounders. Also, using within-group propensity scores estimated from fixed effects logistic regression increases the robustness of the proposed within-group TMLE method. Even if the propensity scores are partially misspecified, the within-group TMLE still produces robust ATE estimates due to double robustness with flexible modeling, unlike parametric-based inverse propensity weighting methods. We demonstrate our proposed methods and conduct sensitivity analyses against the number of groups and individual-level unmeasured confounding to evaluate the effect of taking an eighth-grade algebra course on math achievement in the Early Childhood Longitudinal Study.more » « less
-
Background: Outcome measures that are count variables with excessive zeros are common in health behaviors research. Examples include the number of standard drinks consumed or alcohol‐related problems experienced over time. There is a lack of empirical data about the relative performance of prevailing statistical models for assessing the efficacy of interventions when outcomes are zero‐inflated, particularly compared with recently developed marginalized count regression approaches for such data.Methods: The current simulation study examined five commonly used approaches for analyzing count outcomes, including two linear models (with outcomes on raw and log‐transformed scales, respectively) and three prevailing count distribution‐based models (ie, Poisson, negative binomial, and zero‐inflated Poisson (ZIP) models). We also considered the marginalized zero‐inflated Poisson (MZIP) model, a novel alternative that estimates the overall effects on the population mean while adjusting for zero‐inflation. Motivated by alcohol misuse prevention trials, extensive simulations were conducted to evaluate and compare the statistical power and Type I error rate of the statistical models and approaches across data conditions that varied in sample size ( to 500), zero rate (0.2 to 0.8), and intervention effect sizes.Results: Under zero‐inflation, the Poisson model failed to control the Type I error rate, resulting in higher than expected false positive results. When the intervention effects on the zero (vs. non‐zero) and count parts were in the same direction, the MZIP model had the highest statistical power, followed by the linear model with outcomes on the raw scale, negative binomial model, and ZIP model. The performance of the linear model with a log‐transformed outcome variable was unsatisfactory.Conclusions: The MZIP model demonstrated better statistical properties in detecting true intervention effects and controlling false positive results for zero‐inflated count outcomes. This MZIP model may serve as an appealing analytical approach to evaluating overall intervention effects in studies with count outcomes marked by excessive zeros.more » « less
An official website of the United States government

