Title: A two-step approach to testing overall effect of gene–environment interaction for multiple phenotypes
Abstract
Motivation: While gene–environment (GxE) interactions contribute importantly to many different phenotypes, detecting such interactions requires well-powered studies and has proven difficult. To address this, we combine two approaches to improve GxE power: simultaneously evaluating multiple phenotypes and using a two-step analysis approach. Previous work shows that the power to identify a main genetic effect can be improved by simultaneously analyzing multiple related phenotypes. For a univariate phenotype, two-step methods produce higher power for detecting a GxE interaction compared to single-step analysis. Therefore, we propose a two-step approach to test for an overall GxE effect for multiple phenotypes.
Results: Using simulations, we demonstrate that, when more than one phenotype has a GxE effect (i.e. GxE pleiotropy), our approach offers a substantial gain in power (18–43%) to detect an aggregate-level GxE effect for a multivariate phenotype compared to an analogous two-step method to identify a GxE effect for a univariate phenotype. We applied the proposed approach to simultaneously analyze three lipids (LDL, HDL and triglycerides), with the frequency of alcohol consumption as the environmental factor, in the UK Biobank. The method identified two loci with an overall GxE effect on the vector of lipids, one of which was missed by the competing approaches.
Availability and implementation: We provide an R package, MPGE, implementing the proposed approach, which is available from CRAN: https://cran.r-project.org/web/packages/MPGE/index.html
Supplementary information: Supplementary data are available at Bioinformatics online.
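The two-step idea mentioned in the abstract can be illustrated with a minimal sketch of a generic screening-then-testing procedure for a single phenotype. This is not the MPGE implementation and not the multivariate test proposed in the paper; the function name `two_step_gxe`, the thresholds, the variant IDs and the p-values are all hypothetical, and the procedure is only valid when the screening statistic is independent of the interaction statistic under the null.

```python
from typing import Dict, List


def two_step_gxe(
    screen_p: Dict[str, float],
    gxe_p: Dict[str, float],
    screen_alpha: float = 0.05,
    alpha: float = 0.05,
) -> List[str]:
    """Generic two-step GxE selection (illustrative, not the MPGE method).

    screen_p: per-variant p-values from a screening test (step 1).
    gxe_p:    per-variant p-values from the GxE interaction test (step 2).
    """
    # Step 1: keep only variants whose screening p-value passes the filter.
    kept = [v for v, p in screen_p.items() if p < screen_alpha]
    if not kept:
        return []
    # Step 2: Bonferroni-correct over the retained variants only, which is
    # a smaller multiple-testing burden than correcting over all variants.
    threshold = alpha / len(kept)
    return sorted(v for v in kept if gxe_p[v] < threshold)


# Made-up p-values for four hypothetical variants.
screen_p = {"rs1": 0.001, "rs2": 0.40, "rs3": 0.02, "rs4": 0.75}
gxe_p = {"rs1": 0.01, "rs2": 0.0001, "rs3": 0.04, "rs4": 0.03}
hits = two_step_gxe(screen_p, gxe_p)
print(hits)
```

In this toy input two variants survive step 1, so step 2 uses a threshold of 0.05 / 2 rather than 0.05 / 4. The trade-off is also visible: a variant with a strong interaction signal but a weak screening signal (here `rs2`) is filtered out before it is ever tested.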
Award ID(s):
1943497 1705121
PAR ID:
10257078
Editor(s):
Schwartz, Russell
Journal Name:
Bioinformatics
Volume:
36
Issue:
24
ISSN:
1367-4803
Page Range / eLocation ID:
5640 to 5648
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Wittkopp, Patricia (Ed.)
    Abstract The relationship between genotype and phenotype is often mediated by the environment. Moreover, gene-by-environment (GxE) interactions can contribute to variation in phenotypes and fitness. In the last 500 yr, house mice have invaded the Americas. Despite their short residence time, there is evidence of rapid climate adaptation, including shifts in body size and aspects of metabolism with latitude. Previous selection scans have identified candidate genes for metabolic adaptation. However, environmental variation in diet as well as GxE interactions likely impact body mass variation in wild populations. Here, we investigated the role of the environment and GxE interactions in shaping adaptive phenotypic variation. Using new locally adapted inbred strains from North and South America, we evaluated response to a high-fat diet, finding that sex, strain, diet, and the interaction between strain and diet contributed significantly to variation in body size. We also found that the transcriptional response to diet is largely strain-specific, indicating that GxE interactions affecting gene expression are pervasive. Next, we used crosses between strains from contrasting climates to characterize gene expression regulatory divergence on a standard diet and on a high-fat diet. We found that gene regulatory divergence is often condition-specific, particularly for trans-acting changes. Finally, we found evidence for lineage-specific selection on cis-regulatory variation involved in diverse processes, including lipid metabolism. Overlap with scans for selection identified candidate genes for environmental adaptation with diet-specific effects. Together, our results underscore the importance of environmental variation and GxE interactions in shaping adaptive variation in complex traits. 
  2. ABSTRACT Microfluidic devices (MDs) present a novel method for detecting circulating tumor cells (CTCs), enhancing the process through targeted techniques and visual inspection. However, current approaches often yield heterogeneous CTC populations, necessitating additional processing for comprehensive analysis and phenotype identification. These procedures are often expensive and time‐consuming, and need to be performed by skilled technicians. In this study, we investigate the potential of a cost‐effective and efficient hyperuniform micropost MD approach for CTC classification. Our approach combines mathematical modeling of fluid–structure interactions in a simulated microfluidic channel with machine learning techniques. Specifically, we developed a cell‐based modeling framework to assess CTC dynamics in erythrocyte‐laden plasma flow, generating a large dataset of CTC trajectories that account for two distinct CTC phenotypes. A convolutional neural network (CNN) and a recurrent neural network (RNN) were then employed to analyze the dataset and classify these phenotypes. The results demonstrate the potential effectiveness of the hyperuniform micropost MD design and analysis approach in distinguishing between different CTC phenotypes based on cell trajectory, offering a promising avenue for early cancer detection.
  3. Abstract Correlation among multiple phenotypes across related individuals may reflect some pattern of shared genetic architecture: individual genetic loci affect multiple phenotypes (an effect known as pleiotropy), creating observable relationships between phenotypes. A natural hypothesis is that pleiotropic effects reflect a relatively small set of common “core” cellular processes: each genetic locus affects one or a few core processes, and these core processes in turn determine the observed phenotypes. Here, we propose a method to infer such structure in genotype–phenotype data. Our approach, sparse structure discovery (SSD), is based on a penalized matrix decomposition designed to identify latent structure that is low-dimensional (many fewer core processes than phenotypes and genetic loci), locus-sparse (each locus affects few core processes), and/or phenotype-sparse (each phenotype is influenced by few core processes). Our use of sparsity as a guide in the matrix decomposition is motivated by the results of a novel empirical test indicating evidence of sparse structure in several recent genotype–phenotype datasets. First, we use synthetic data to show that our SSD approach can accurately recover core processes if each genetic locus affects few core processes or if each phenotype is affected by few core processes. Next, we apply the method to three datasets spanning adaptive mutations in yeast, a genotoxin robustness assay in human cell lines, and genetic loci identified from a yeast cross, and evaluate the biological plausibility of the core processes identified. More generally, we propose sparsity as a guiding prior for resolving latent structure in empirical genotype–phenotype maps.
  4. Mapping the genetic basis of complex traits is critical to uncovering the biological mechanisms that underlie disease and other phenotypes. Genome-wide association studies (GWAS) in humans and quantitative trait locus (QTL) mapping in model organisms can now explain much of the observed heritability in many traits, allowing us to predict phenotype from genotype. However, constraints on power due to statistical confounders in large GWAS and smaller sample sizes in QTL studies still limit our ability to resolve numerous small-effect variants, map them to causal genes, identify pleiotropic effects across multiple traits, and infer non-additive interactions between loci (epistasis). Here, we introduce barcoded bulk quantitative trait locus (BB-QTL) mapping, which allows us to construct, genotype, and phenotype 100,000 offspring of a budding yeast cross, two orders of magnitude larger than the previous state of the art. We use this panel to map the genetic basis of eighteen complex traits, finding that the genetic architecture of these traits involves hundreds of small-effect loci densely spaced throughout the genome, many with widespread pleiotropic effects across multiple traits. Epistasis plays a central role, with thousands of interactions that provide insight into genetic networks. By dramatically increasing sample size, BB-QTL mapping demonstrates the potential of natural variants in high-powered QTL studies to reveal the highly polygenic, pleiotropic, and epistatic architecture of complex traits. 
  5. Abstract. Motivation: Many variants identified by genome-wide association studies (GWAS) have been found to affect multiple traits, either directly or through shared pathways. There is currently a wealth of GWAS data collected in numerous phenotypes, and analyzing multiple traits at once can increase power to detect shared variant effects. However, traditional meta-analysis methods are not suitable for combining studies on different traits. When applied to dissimilar studies, these meta-analysis methods can be underpowered compared to univariate analysis. The degree to which traits share variant effects is often not known, and the vast majority of GWAS meta-analyses consider only one trait at a time. Results: Here, we present a flexible method for finding associated variants from GWAS summary statistics for multiple traits. Our method estimates the degree of shared effects between traits from the data. Using simulations, we show that our method properly controls the false positive rate and increases power when an effect is present in a subset of traits. We then apply our method to the North Finland Birth Cohort and UK Biobank datasets using a variety of metabolic traits and discover novel loci. Availability and implementation: Our source code is available at https://github.com/lgai/CONFIT. Supplementary information: Supplementary data are available at Bioinformatics online.