NSF PAR Search | NSF Public Access Repository

Scalable summary-statistics-based heritability estimation method with individual genotype level accuracy

https://doi.org/10.1101/gr.279207.124

Jeong, Moonseong; Pazokitoroudi, Ali; Liu, Zhengtong; Sankararaman, Sriram (September 2024, Genome Research)

SNP heritability, the proportion of phenotypic variation explained by genotyped SNPs, is an important parameter in understanding the genetic architecture underlying various diseases and traits. Methods that aim to estimate SNP heritability from individual genotype and phenotype data are limited by their ability to scale to Biobank-scale data sets and by the restrictions in access to individual-level data. These limitations have motivated the development of methods that only require summary statistics. Although the availability of publicly accessible summary statistics makes them widely applicable, these methods lack the accuracy of methods that utilize individual genotypes. Here we present SUMmary-statistics-based Randomized Haseman-Elston regression (SUM-RHE), a method that can estimate the SNP heritability of complex phenotypes with accuracies comparable to approaches that require individual genotypes, while exclusively relying on summary statistics. SUM-RHE employs Genome-Wide Association Study (GWAS) summary statistics and statistics obtained on a reference population, which can be efficiently estimated and readily shared for public use. Our results demonstrate that SUM-RHE obtains estimates of SNP heritability that are substantially more accurate than those of other summary-statistics methods and on par with those of methods that rely on individual-level data.
A scalable and robust variance components method reveals insights into the architecture of gene-environment interactions underlying complex traits

https://doi.org/10.1016/j.ajhg.2024.05.015

Pazokitoroudi, Ali; Liu, Zhengtong; Dahl, Andrew; Zaitlen, Noah; Rosset, Saharon; Sankararaman, Sriram (July 2024, The American Journal of Human Genetics)

Characterizing the genetic architecture of drug response using gene-context interaction methods

https://doi.org/10.1016/j.xgen.2024.100722

Sadowski, Michal; Thompson, Mike; Mefford, Joel; Haldar, Tanushree; Oni-Orisan, Akinyemi; Border, Richard; Pazokitoroudi, Ali; Cai, Na; Ayroles, Julien F; Sankararaman, Sriram; et al (December 2024, Cell Genomics)

Fast kernel-based association testing of non-linear genetic effects for biobank-scale data

https://doi.org/10.1038/s41467-023-40346-2

Fu, Boyang; Pazokitoroudi, Ali; Sudarshan, Mukund; Liu, Zhengtong; Subramanian, Lakshminarayanan; Sankararaman, Sriram (December 2023, Nature Communications)

Our knowledge of non-linear genetic effects on complex traits remains limited, in part due to the modest power to detect such effects. While kernel-based tests offer a versatile approach to test for non-linear relationships between sets of genetic variants and traits, current approaches cannot be applied to Biobank-scale datasets containing hundreds of thousands of individuals. We propose FastKAST, a kernel-based approach that can test for non-linear effects of a set of variants on a quantitative trait. FastKAST provides calibrated hypothesis tests while enabling analysis of Biobank-scale datasets with hundreds of thousands of unrelated individuals from a homogeneous population. We apply FastKAST to 53 quantitative traits measured across ≈300K unrelated white British individuals in the UK Biobank to detect sets of variants with non-linear effects at genome-wide significance.
The lingering effects of Neanderthal introgression on human complex traits

https://doi.org/10.7554/eLife.80757

Wei, Xinzhu; Robles, Christopher R; Pazokitoroudi, Ali; Ganna, Andrea; Gusev, Alexander; Durvasula, Arun; Gazal, Steven; Loh, Po-Ru; Reich, David; Sankararaman, Sriram (March 2023, eLife)

The genetic variants introduced into the ancestors of modern humans from interbreeding with Neanderthals have been suggested to contribute an unexpected extent to complex human traits. However, testing this hypothesis has been challenging due to the idiosyncratic population genetic properties of introgressed variants. We developed rigorous methods to assess the contribution of introgressed Neanderthal variants to heritable trait variation and applied these methods to analyze 235,592 introgressed Neanderthal variants and 96 distinct phenotypes measured in about 300,000 unrelated white British individuals in the UK Biobank. Introgressed Neanderthal variants make a significant contribution to trait variation (explaining 0.12% of trait variation on average). However, the contribution of introgressed variants tends to be significantly depleted relative to modern human variants matched for allele frequency and linkage disequilibrium (about 59% depletion on average), consistent with purifying selection on introgressed variants. Different from previous studies (McArthur et al., 2021), we find no evidence for elevated heritability across the phenotypes examined. We identified 348 independent significant associations of introgressed Neanderthal variants with 64 phenotypes. Previous work (Skov et al., 2020) has suggested that a majority of such associations are likely driven by statistical association with nearby modern human variants that are the true causal variants. Applying a customized fine-mapping led us to identify 112 regions across 47 phenotypes containing 4303 unique genetic variants where introgressed variants are highly likely to have a phenotypic effect. Examination of these variants reveals their substantial impact on genes that are important for the immune system, development, and metabolism.
Deep learning-based phenotype imputation on population-scale biobank data increases genetic discoveries

https://doi.org/10.1038/s41588-023-01558-w

An, Ulzee; Pazokitoroudi, Ali; Alvarez, Marcus; Huang, Lianyun; Bacanu, Silviu; Schork, Andrew J.; Kendler, Kenneth; Pajukanta, Päivi; Flint, Jonathan; Zaitlen, Noah; et al (November 2023, Nature Genetics)

Biobanks that collect deep phenotypic and genomic data across many individuals have emerged as a key resource in human genetics. However, phenotypes in biobanks are often missing across many individuals, limiting their utility. We propose AutoComplete, a deep learning-based imputation method to impute or ‘fill in’ missing phenotypes in population-scale biobank datasets. When applied to collections of phenotypes measured across ~300,000 individuals from the UK Biobank, AutoComplete substantially improved imputation accuracy over existing methods. On three traits with notable amounts of missingness, we show that AutoComplete yields imputed phenotypes that are genetically similar to the originally observed phenotypes while increasing the effective sample size by about twofold on average. Further, genome-wide association analyses on the resulting imputed phenotypes led to a substantial increase in the number of associated loci. Our results demonstrate the utility of deep learning-based phenotype imputation to increase power for genetic discoveries in existing biobank datasets.
Efficient variance components analysis across millions of genomes

https://doi.org/10.1038/s41467-020-17576-9

Pazokitoroudi, Ali; Wu, Yue; Burch, Kathryn S.; Hou, Kangcheng; Zhou, Aaron; Pasaniuc, Bogdan; Sankararaman, Sriram (August 2020, Nature Communications)

While variance components analysis has emerged as a powerful tool in complex trait genetics, existing methods for fitting variance components do not scale well to large-scale datasets of genetic variation. Here, we present a method for variance components analysis that is accurate and efficient: capable of estimating one hundred variance components on a million individuals genotyped at a million SNPs in a few hours. We illustrate the utility of our method in estimating and partitioning variation in a trait explained by genotyped SNPs (SNP-heritability). Analyzing 22 traits with genotypes from 300,000 individuals across about 8 million common and low-frequency SNPs, we observe that per-allele squared effect size increases with decreasing minor allele frequency (MAF) and linkage disequilibrium (LD), consistent with the action of negative selection. Partitioning heritability across 28 functional annotations, we observe enrichment of heritability in FANTOM5 enhancers in asthma, eczema, thyroid and autoimmune disorders.