NSF PAR Search | NSF Public Access Repository

Accurate modeling of replication rates in genome-wide association studies by accounting for Winner’s Curse and study-specific heterogeneity

https://doi.org/10.1093/g3journal/jkac261

Zou, Jennifer; Zhou, Jinjing; Faller, Sarah; Brown, Robert P.; Sankararaman, Sriram S.; Eskin, Eleazar (October 2022, G3 Genes|Genomes|Genetics)
Matise, T. (Ed.)

Abstract Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with complex human traits, but only a fraction of variants identified in discovery studies achieve significance in replication studies. Replication in genome-wide association studies has been well-studied in the context of Winner’s Curse, which is the inflation of effect size estimates for significant variants due to statistical chance. However, Winner’s Curse is often not sufficient to explain lack of replication. Another reason why studies fail to replicate is that there are fundamental differences between the discovery and replication studies. A confounding factor can create the appearance of a significant finding while actually being an artifact that will not replicate in future studies. We propose a statistical framework that utilizes genome-wide association studies and replication studies to jointly model Winner’s Curse and study-specific heterogeneity due to confounding factors. We apply this framework to 100 genome-wide association studies from the Human Genome-Wide Association Studies Catalog and observe that there is a large range in the level of estimated confounding. We demonstrate how this framework can be used to distinguish when studies fail to replicate due to statistical noise and when they fail due to confounding.
Leveraging family data to design Mendelian Randomization that is provably robust to population stratification

https://doi.org/10.1101/gr.277664.123

LaPierre, Nathan; Fu, Boyang; Turnbull, Steven; Eskin, Eleazar; Sankararaman, Sriram (May 2023, Genome Research)

Mendelian Randomization (MR) has emerged as a powerful approach to leverage genetic instruments to infer causality between pairs of traits in observational studies. However, the results of such studies are susceptible to biases due to weak instruments as well as the confounding effects of population stratification and horizontal pleiotropy. Here, we show that family data can be leveraged to design MR tests that are provably robust to confounding from population stratification, assortative mating, and dynastic effects. We demonstrate in simulations that our approach, MR-Twin, is robust to confounding from population stratification and is not affected by weak instrument bias, while standard MR methods yield inflated false positive rates. We then conducted an exploratory analysis of MR-Twin and other MR methods applied to 121 trait pairs in the UK Biobank dataset. Our results suggest that confounding from population stratification can lead to false positives for existing MR methods, while MR-Twin is immune to this type of confounding, and that MR-Twin can help assess whether traditional approaches may be inflated due to confounding from population stratification.
Leveraging pleiotropy for joint analysis of genome-wide association studies with per trait interpretations

https://doi.org/10.1371/journal.pgen.1010447

Taraszka, Kodi; Zaitlen, Noah; Eskin, Eleazar (November 2022, PLOS Genetics)
Epstein, Michael P. (Ed.)
We introduce pleiotropic association test (PAT) for joint analysis of multiple traits using genome-wide association study (GWAS) summary statistics. The method utilizes the decomposition of phenotypic covariation into genetic and environmental components to create a likelihood ratio test statistic for each genetic variant. Though PAT does not directly interpret which trait(s) drive the association, a per trait interpretation of the omnibus p-value is provided through an extension to the meta-analysis framework, m-values. In simulations, we show PAT controls the false positive rate, increases statistical power, and is robust to model misspecifications of genetic effect. Additionally, simulations comparing PAT to three multi-trait methods, HIPO, MTAG, and ASSET, show PAT identified 15.3% more omnibus associations over the next best method. When these associations were interpreted on a per trait level using m-values, PAT had 37.5% more true per trait interpretations with a 0.92% false positive assignment rate. When analyzing four traits from the UK Biobank, PAT discovered 22,095 novel variants. Through the m-values interpretation framework, the number of per trait associations for two traits were almost tripled and were nearly doubled for another trait relative to the original single trait GWAS.
Bruins-in-Genomics: Evaluation of the impact of a UCLA undergraduate summer program in computational biology on participating students

https://doi.org/10.1371/journal.pone.0268861

Coller, Hilary A.; Beggs, Stacey; Andrews, Samantha; Maloy, Jeff; Chiu, Alec; Sankararaman, Sriram; Pellegrini, Matteo; Freimer, Nelson; Johnson, Tracy; Papp, Jeanette; et al (May 2022, PLOS ONE)
Kakulapati, Vijayalakshmi (Ed.)
Recruiting, training and retaining scientists in computational biology is necessary to develop a workforce that can lead the quantitative biology revolution. Yet, African-American/Black, Hispanic/Latinx, Native Americans, and women are severely underrepresented in computational biosciences. We established the UCLA Bruins-in-Genomics Summer Research Program to provide training and research experiences in quantitative biology and bioinformatics to undergraduate students with an emphasis on students from backgrounds underrepresented in computational biology. Program assessment was based on number of applicants, alumni surveys and comparison of post-graduate educational choices for participants and a control group of students who were accepted but declined to participate. We hypothesized that participation in the Bruins-in-Genomics program would increase the likelihood that students would pursue post-graduate education in a related field. Our surveys revealed that 75% of Bruins-in-Genomics Summer participants were enrolled in graduate school. Logistic regression analysis revealed that women who participated in the program were significantly more likely to pursue a Ph.D. than a matched control group (group x woman interaction term of p = 0.005). The Bruins-in-Genomics Summer program represents an example of how a combined didactic-research program structure can make computational biology accessible to a wide range of undergraduates and increase participation in quantitative biosciences.
MARS: leveraging allelic heterogeneity to increase power of association testing

https://doi.org/10.1186/s13059-021-02353-8

Hormozdiari, Farhad; Jung, Junghyun; Eskin, Eleazar; J. Joo, Jong Wha (December 2021, Genome Biology)
Abstract In standard genome-wide association studies (GWAS), the standard association test is underpowered to detect associations between loci with multiple causal variants with small effect sizes. We propose a statistical method, Model-based Association test Reflecting causal Status (MARS), that finds associations between variants in risk loci and a phenotype, considering the causal status of variants, only requiring the existing summary statistics to detect associated risk loci. Utilizing extensive simulated data and real data, we show that MARS increases the power of detecting true associated risk loci compared to previous approaches that consider multiple variants, while controlling the type I error.
Analysis of independent cohorts of outbred CFW mice reveals novel loci for behavioral and physiological traits and identifies factors determining reproducibility

https://doi.org/10.1093/g3journal/jkab394

Zou, Jennifer; Gopalakrishnan, Shyam; Parker, Clarissa C; Nicod, Jerome; Mott, Richard; Cai, Na; Lionikas, Arimantas; Davies, Robert W; Palmer, Abraham A; Flint, Jonathan (November 2021, G3 Genes|Genomes|Genetics)
Matise, T (Ed.)
Abstract Combining samples for genetic association is standard practice in human genetic analysis of complex traits, but is rarely undertaken in rodent genetics. Here, using 23 phenotypes and genotypes from two independent laboratories, we obtained a sample size of 3076 commercially available outbred mice and identified 70 loci, more than double the number of loci identified in the component studies. Fine-mapping in the combined sample reduced the number of likely causal variants, with a median reduction in set size of 51%, and indicated novel gene associations, including Pnpo, Ttll6, and GM11545 with bone mineral density, and Psmb9 with weight. However, replication at a nominal threshold of 0.05 between the two component studies was low, with less than one-third of loci identified in one study replicated in the second. In addition to overestimates in the effect size in the discovery sample (Winner’s Curse), we also found that heterogeneity between studies explained the poor replication, but the contribution of these two factors varied among traits. Leveraging these observations, we integrated information about replication rates, study-specific heterogeneity, and Winner’s Curse corrected estimates of power to assign variants to one of four confidence levels. Our approach addresses concerns about reproducibility and demonstrates how to obtain robust results from mapping complex traits in any genome-wide association study.
PLEIO: a method to map and interpret pleiotropic loci with GWAS summary statistics

https://doi.org/10.1016/j.ajhg.2020.11.017

Lee, Cue Hyunkyu; Shi, Huwenbo; Pasaniuc, Bogdan; Eskin, Eleazar; Han, Buhm (January 2021, The American Journal of Human Genetics)
Genome-Wide Association Study in Two Cohorts from a Multi-generational Mouse Advanced Intercross Line Highlights the Difficulty of Replication Due to Study-Specific Heterogeneity

https://doi.org/10.1534/g3.119.400763

Zhou, Xinzhu; St. Pierre, Celine L; Gonzales, Natalia M; Zou, Jennifer; Cheng, Riyan; Chitre, Apurva S; Sokoloff, Greta; Palmer, Abraham A (March 2020, G3 Genes|Genomes|Genetics)
Abstract There has been extensive discussion of the “Replication Crisis” in many fields, including genome-wide association studies (GWAS). We explored replication in a mouse model using an advanced intercross line (AIL), which is a multigenerational intercross between two inbred strains. We re-genotyped a previously published cohort of LG/J x SM/J AIL mice (F34; n = 428) using a denser marker set and genotyped a new cohort of AIL mice (F39-43; n = 600) for the first time. We identified 36 novel genome-wide significant loci in the F34 and 25 novel loci in the F39-43 cohort. The subset of traits that were measured in both cohorts (locomotor activity, body weight, and coat color) showed high genetic correlations, although the SNP heritabilities were slightly lower in the F39-43 cohort. For this subset of traits, we attempted to replicate loci identified in either F34 or F39-43 in the other cohort. Coat color was robustly replicated; locomotor activity and body weight were only partially replicated, which was inconsistent with our power simulations. We used a random effects model to show that the partial replications could not be explained by Winner’s Curse but could be explained by study-specific heterogeneity. Despite this heterogeneity, we performed a mega-analysis by combining F34 and F39-43 cohorts (n = 1,028), which identified four novel loci associated with locomotor activity and body weight. These results illustrate that even with the high degree of genetic and environmental control possible in our experimental system, replication was hindered by study-specific heterogeneity, which has broad implications for ongoing concerns about reproducibility.