Introduction: Autoimmune disorders (ADs) are a group of about 80 disorders that occur when self-attacking autoantibodies are produced due to failure in the self-tolerance mechanisms. ADs are polygenic disorders and associations with genes both in the human leukocyte antigen (HLA) region and outside of it have been described. Previous studies have shown that they are highly comorbid with shared genetic risk factors, while epidemiological studies revealed associations between various lifestyle and health-related phenotypes and ADs. Methods: Here, for the first time, we performed a comparative polygenic risk score (PRS) - Phenome Wide Association Study (PheWAS) for 11 different ADs (Juvenile Idiopathic Arthritis, Primary Sclerosing Cholangitis, Celiac Disease, Multiple Sclerosis, Rheumatoid Arthritis, Psoriasis, Myasthenia Gravis, Type 1 Diabetes, Systemic Lupus Erythematosus, Vitiligo Late Onset, Vitiligo Early Onset) and 3,254 phenotypes available in the UK Biobank that include a wide range of socio-demographic, lifestyle and health-related outcomes. Additionally, we investigated the genetic relationships of the studied ADs, calculating their genetic correlation and conducting cross-disorder GWAS meta-analyses for the observed AD clusters. Results: In total, we identified 508 phenotypes significantly associated with at least one AD PRS. 272 phenotypes were significantly associated after excluding variants in the HLA region from the PRS estimation. Through genetic correlation and genetic factor analyses, we identified four genetic factors that run across studied ADs. Cross-trait meta-analyses within each factor revealed pleiotropic genome-wide significant loci. Discussion: Overall, our study confirms the association of different factors with genetic susceptibility for ADs and reveals novel observations that need to be further explored.
A data-adaptive Bayesian regression approach for polygenic risk prediction
Abstract Motivation: Polygenic risk score (PRS) has been widely exploited for genetic risk prediction due to its accuracy and conceptual simplicity. We introduce a unified Bayesian regression framework, NeuPred, for PRS construction, which accommodates varying genetic architectures and improves overall prediction accuracy for complex diseases by allowing for a wide class of prior choices. To take full advantage of the framework, we propose a summary-statistics-based cross-validation strategy to automatically select suitable chromosome-level priors, which demonstrates a striking variability of the prior preference of each chromosome, for the same complex disease, and further significantly improves the prediction accuracy. Results: Simulation studies and real data applications with seven disease datasets from the Wellcome Trust Case Control Consortium cohort and eight groups of large-scale genome-wide association studies demonstrate that NeuPred achieves substantial and consistent improvements in terms of predictive r² over existing methods. In addition, NeuPred has similar or advantageous computational efficiency compared with the state-of-the-art Bayesian methods. Availability and implementation: The R package implementing NeuPred is available at https://github.com/shuangsong0110/NeuPred. Supplementary information: Supplementary data are available at Bioinformatics online.
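As a rough illustration of the two quantities the abstract refers to, the sketch below computes a PRS as a weighted sum of allele dosages and a predictive r² as the squared Pearson correlation between scores and phenotypes. It is not NeuPred and does not reproduce its Bayesian estimation or prior selection; the function names, the simulated dosages, and the effect sizes are all assumptions made for this example.

```python
import random


def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (0, 1 or 2 copies) for one individual."""
    return sum(d * b for d, b in zip(dosages, effect_sizes))


def predictive_r2(scores, phenotypes):
    """Squared Pearson correlation between scores and observed phenotypes."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_y = sum(phenotypes) / n
    cov = sum((s - mean_s) * (y - mean_y) for s, y in zip(scores, phenotypes))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_y = sum((y - mean_y) ** 2 for y in phenotypes)
    return cov * cov / (var_s * var_y)


# Simulated toy data (hypothetical, for illustration only).
rng = random.Random(0)
n_individuals, n_variants = 200, 30
true_effects = [rng.gauss(0.0, 0.1) for _ in range(n_variants)]
# Stand-in for effect sizes estimated from GWAS summary statistics.
estimated_effects = [b + rng.gauss(0.0, 0.02) for b in true_effects]

genotypes = [[rng.randint(0, 2) for _ in range(n_variants)]
             for _ in range(n_individuals)]
phenotypes = [polygenic_risk_score(g, true_effects) + rng.gauss(0.0, 1.0)
              for g in genotypes]

scores = [polygenic_risk_score(g, estimated_effects) for g in genotypes]
r2 = predictive_r2(scores, phenotypes)
print(f"predictive r2 on simulated data: {r2:.3f}")
```

Methods such as the one described above differ mainly in how the per-variant effect sizes are estimated; the scoring step itself stays a weighted sum.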
- PAR ID: 10364606
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Bioinformatics
- Volume: 38
- Issue: 7
- ISSN: 1367-4803
- Format(s): Medium: X Size: p. 1938-1946
- Size(s): p. 1938-1946
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Complex disorders are caused by a combination of genetic, environmental and lifestyle factors, and their prevalence can vary greatly across different populations. The extent to which genetic risk, as identified by Genome Wide Association Study (GWAS), correlates with disease prevalence in different populations has not been investigated systematically. Here, we studied 14 different complex disorders and explored whether polygenic risk scores (PRS) based on current GWAS correlate with disease prevalence within Europe and around the world. A clear variation in GWAS-based genetic risk was observed based on ancestry and we identified populations that have a higher genetic liability for developing certain disorders. We found that for four out of the 14 studied disorders, PRS significantly correlates with disease prevalence within Europe. We also found significant correlations between worldwide disease prevalence and PRS for eight of the studied disorders, with Multiple Sclerosis genetic risk having the highest correlation with disease prevalence. Based on current GWAS results, the across-population differences in genetic risk for certain disorders can potentially be used to understand differences in disease prevalence and identify populations with the highest genetic liability. The study highlights both the limitations of PRS based on current GWAS and the fact that in some cases, PRS may already have high predictive power. This could be due to the genetic architecture of specific disorders or increased GWAS power in some cases.
-
Abstract Biobanks often contain several phenotypes relevant to diseases such as major depressive disorder (MDD), with partly distinct genetic architectures. Researchers face complex tradeoffs between shallow (large sample size, low specificity/sensitivity) and deep (small sample size, high specificity/sensitivity) phenotypes, and the optimal choices are often unclear. Here we propose to integrate these phenotypes to combine the benefits of each. We use phenotype imputation to integrate information across hundreds of MDD-relevant phenotypes, which significantly increases genome-wide association study (GWAS) power and polygenic risk score (PRS) prediction accuracy of the deepest available MDD phenotype in UK Biobank, LifetimeMDD. We demonstrate that imputation preserves specificity in its genetic architecture using a novel PRS-based pleiotropy metric. We further find that integration via summary statistics also enhances GWAS power and PRS predictions, but can introduce nonspecific genetic effects depending on input. Our work provides a simple and scalable approach to improve genetic studies in large biobanks by integrating shallow and deep phenotypes.
-
Abstract Motivation: A large proportion of risk regions identified by genome-wide association studies (GWAS) are shared across multiple diseases and traits. Understanding whether this clustering is due to sharing of causal variants or chance colocalization can provide insights into shared etiology of complex traits and diseases. Results: In this work, we propose a flexible, unifying framework to quantify the overlap between a pair of traits called UNITY (Unifying Non-Infinitesimal Trait analYsis). We formulate a Bayesian generative model that relates the overlap between pairs of traits to GWAS summary statistic data under a non-infinitesimal genetic architecture underlying each trait. We propose a Metropolis–Hastings sampler to compute the posterior density of the genetic overlap parameters in this model. We validate our method through comprehensive simulations and analyze summary statistics from height and body mass index GWAS to show that it produces estimates consistent with the known genetic makeup of both traits. Availability and implementation: The UNITY software is made freely available to the research community at: https://github.com/bogdanlab/UNITY. Supplementary information: Supplementary data are available at Bioinformatics online.
-
Background: Both lifestyle and genetic factors confer risk for cardiovascular diseases, type 2 diabetes, and dyslipidemia. However, the interactions between these 2 groups of risk factors were not comprehensively understood due to previous poor estimation of genetic risk. Here we set out to develop enhanced polygenic risk scores (PRS) and systematically investigate multiplicative and additive interactions between PRS and lifestyle for coronary artery disease, atrial fibrillation, type 2 diabetes, total cholesterol, triglyceride, and LDL-cholesterol. Methods: Our study included 276 096 unrelated White British participants from the UK Biobank. We investigated several PRS methods (P+T, LDpred, PRS continuous shrinkage, and AnnoPred) and showed that AnnoPred achieved consistently improved prediction accuracy for all 6 diseases/traits. With enhanced PRS and combined lifestyle status categorized by smoking, body mass index, physical activity, and diet, we investigated both multiplicative and additive interactions between PRS and lifestyle using regression models. Results: We observed that healthy lifestyle reduced disease incidence by similar multiplicative magnitude across different PRS groups. The absolute risk reduction from lifestyle adherence was, however, significantly greater in individuals with higher PRS. Specifically, for type 2 diabetes, the absolute risk reduction from lifestyle adherence was 12.4% (95% CI, 10.0%–14.9%) in the top 1% PRS versus 2.8% (95% CI, 2.3%–3.3%) in the bottom PRS decile, leading to a ratio of >4.4. We also observed a significant interaction effect between PRS and lifestyle on triglyceride level. Conclusions: By leveraging functional annotations, AnnoPred outperforms state-of-the-art methods on quantifying genetic risk through PRS. Our analyses based on enhanced PRS suggest that individuals with high genetic risk may derive similar relative but greater absolute benefit from lifestyle adherence.
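The distinction between a similar relative (multiplicative) benefit and a greater absolute (additive) benefit can be checked with simple arithmetic. In the sketch below, the baseline incidences and the shared relative risk are hypothetical values chosen for illustration and are not taken from the study; only the 12.4% and 2.8% absolute risk reductions come from the abstract above.

```python
def absolute_risk_reduction(baseline_risk, relative_risk):
    """Risk removed when a baseline risk is multiplied by a relative risk."""
    return baseline_risk * (1.0 - relative_risk)


# Hypothetical inputs: the same relative risk applied to two PRS groups.
relative_risk = 0.6          # assumed: healthy lifestyle cuts risk by 40%
high_prs_baseline = 0.30     # assumed incidence in a high-PRS group
low_prs_baseline = 0.05      # assumed incidence in a low-PRS group

arr_high = absolute_risk_reduction(high_prs_baseline, relative_risk)
arr_low = absolute_risk_reduction(low_prs_baseline, relative_risk)

# Same multiplicative effect, but a larger absolute benefit at high PRS.
print(f"ARR high PRS: {arr_high:.3f}, ARR low PRS: {arr_low:.3f}")

# Ratio of the absolute risk reductions reported in the abstract
# for type 2 diabetes (top 1% PRS versus bottom PRS decile).
reported_ratio = 12.4 / 2.8
print(f"reported ARR ratio: {reported_ratio:.2f}")
```

With an identical relative risk in both groups, the absolute reduction scales with the baseline risk, which is the pattern the abstract describes.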