skip to main content


Title: Analysis of independent cohorts of outbred CFW mice reveals novel loci for behavioral and physiological traits and identifies factors determining reproducibility
Abstract Combining samples for genetic association is standard practice in human genetic analysis of complex traits, but is rarely undertaken in rodent genetics. Here, using 23 phenotypes and genotypes from two independent laboratories, we obtained a sample size of 3076 commercially available outbred mice and identified 70 loci, more than double the number of loci identified in the component studies. Fine-mapping in the combined sample reduced the number of likely causal variants, with a median reduction in set size of 51%, and indicated novel gene associations, including Pnpo, Ttll6, and GM11545 with bone mineral density, and Psmb9 with weight. However, replication at a nominal threshold of 0.05 between the two component studies was low, with less than one-third of loci identified in one study replicated in the second. In addition to overestimates in the effect size in the discovery sample (Winner’s Curse), we also found that heterogeneity between studies explained the poor replication, but the contribution of these two factors varied among traits. Leveraging these observations, we integrated information about replication rates, study-specific heterogeneity, and Winner’s Curse corrected estimates of power to assign variants to one of four confidence levels. Our approach addresses concerns about reproducibility and demonstrates how to obtain robust results from mapping complex traits in any genome-wide association study.  more » « less
Award ID(s):
1910885
NSF-PAR ID:
10348476
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ;
Editor(s):
Matise, T
Date Published:
Journal Name:
G3 Genes|Genomes|Genetics
Volume:
12
Issue:
1
ISSN:
2160-1836
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with complex human traits, but only a fraction of variants identified in discovery studies achieve significance in replication studies. Replication in genome-wide association studies has been well-studied in the context of Winner’s Curse, which is the inflation of effect size estimates for significant variants due to statistical chance. However, Winner’s Curse is often not sufficient to explain lack of replication. Another reason why studies fail to replicate is that there are fundamental differences between the discovery and replication studies. A confounding factor can create the appearance of a significant finding while actually being an artifact that will not replicate in future studies. We propose a statistical framework that utilizes genome-wide association studies and replication studies to jointly model Winner’s Curse and study-specific heterogeneity due to confounding factors. We apply this framework to 100 genome-wide association studies from the Human Genome-Wide Association Studies Catalog and observe that there is a large range in the level of estimated confounding. We demonstrate how this framework can be used to distinguish when studies fail to replicate due to statistical noise and when they fail due to confounding.

     
    more » « less
  2. null (Ed.)
    Abstract There has been extensive discussion of the “Replication Crisis” in many fields, including genome-wide association studies (GWAS). We explored replication in a mouse model using an advanced intercross line (AIL), which is a multigenerational intercross between two inbred strains. We re-genotyped a previously published cohort of LG/J x SM/J AIL mice (F34; n = 428) using a denser marker set and genotyped a new cohort of AIL mice (F39-43; n = 600) for the first time. We identified 36 novel genome-wide significant loci in the F34 and 25 novel loci in the F39-43 cohort. The subset of traits that were measured in both cohorts (locomotor activity, body weight, and coat color) showed high genetic correlations, although the SNP heritabilities were slightly lower in the F39-43 cohort. For this subset of traits, we attempted to replicate loci identified in either F34 or F39-43 in the other cohort. Coat color was robustly replicated; locomotor activity and body weight were only partially replicated, which was inconsistent with our power simulations. We used a random effects model to show that the partial replications could not be explained by Winner’s Curse but could be explained by study-specific heterogeneity. Despite this heterogeneity, we performed a mega-analysis by combining F34 and F39-43 cohorts (n = 1,028), which identified four novel loci associated with locomotor activity and body weight. These results illustrate that even with the high degree of genetic and environmental control possible in our experimental system, replication was hindered by study-specific heterogeneity, which has broad implications for ongoing concerns about reproducibility. 
    more » « less
  3. Mapping the genetic basis of complex traits is critical to uncovering the biological mechanisms that underlie disease and other phenotypes. Genome-wide association studies (GWAS) in humans and quantitative trait locus (QTL) mapping in model organisms can now explain much of the observed heritability in many traits, allowing us to predict phenotype from genotype. However, constraints on power due to statistical confounders in large GWAS and smaller sample sizes in QTL studies still limit our ability to resolve numerous small-effect variants, map them to causal genes, identify pleiotropic effects across multiple traits, and infer non-additive interactions between loci (epistasis). Here, we introduce barcoded bulk quantitative trait locus (BB-QTL) mapping, which allows us to construct, genotype, and phenotype 100,000 offspring of a budding yeast cross, two orders of magnitude larger than the previous state of the art. We use this panel to map the genetic basis of eighteen complex traits, finding that the genetic architecture of these traits involves hundreds of small-effect loci densely spaced throughout the genome, many with widespread pleiotropic effects across multiple traits. Epistasis plays a central role, with thousands of interactions that provide insight into genetic networks. By dramatically increasing sample size, BB-QTL mapping demonstrates the potential of natural variants in high-powered QTL studies to reveal the highly polygenic, pleiotropic, and epistatic architecture of complex traits. 
    more » « less
  4. Macdonald, S (Ed.)
    Abstract Quantitative genetics in Caenorhabditis elegans seeks to identify naturally segregating genetic variants that underlie complex traits. Genome-wide association studies scan the genome for individual genetic variants that are significantly correlated with phenotypic variation in a population, or quantitative trait loci. Genome-wide association studies are a popular choice for quantitative genetic analyses because the quantitative trait loci that are discovered segregate in natural populations. Despite numerous successful mapping experiments, the empirical performance of genome-wide association study has not, to date, been formally evaluated in C. elegans. We developed an open-source genome-wide association study pipeline called NemaScan and used a simulation-based approach to provide benchmarks of mapping performance in collections of wild C. elegans strains. Simulated trait heritability and complexity determined the spectrum of quantitative trait loci detected by genome-wide association studies. Power to detect smaller-effect quantitative trait loci increased with the number of strains sampled from the C. elegans Natural Diversity Resource. Population structure was a major driver of variation in mapping performance, with populations shaped by recent selection exhibiting significantly lower false discovery rates than populations composed of more divergent strains. We also recapitulated previous genome-wide association studies of experimentally validated quantitative trait variants. Our simulation-based evaluation of performance provides the community with critical context to pursue quantitative genetic studies using the C. elegans Natural Diversity Resource to elucidate the genetic basis of complex traits in C. elegans natural populations. 
    more » « less
  5. Abstract Background

    Genome wide association (GWA) studies demonstrate linkages between genetic variants and traits of interest. Here, we tested associations between single nucleotide polymorphisms (SNPs) in rice (Oryza sativa) and two root hair traits, root hair length (RHL) and root hair density (RHD). Root hairs are outgrowths of single cells on the root epidermis that aid in nutrient and water acquisition and have also served as a model system to study cell differentiation and tip growth. Using lines from the Rice Diversity Panel-1, we explored the diversity of root hair length and density across four subpopulations of rice (aus,indica,temperate japonica, andtropical japonica). GWA analysis was completed using the high-density rice array (HDRA) and the rice reference panel (RICE-RP) SNP sets.

    Results

    We identified 18 genomic regions related to root hair traits, 14 of which related to RHD and four to RHL. No genomic regions were significantly associated with both traits. Two regions overlapped with previously identified quantitative trait loci (QTL) associated with root hair density in rice. We identified candidate genes in these regions and present those with previously published expression data relevant to root hair development. We re-phenotyped a subset of lines with extreme RHD phenotypes and found that the variation in RHD was due to differences in cell differentiation, not cell size, indicating genes in an associated genomic region may influence root hair cell fate. The candidate genes that we identified showed little overlap with previously characterized genes in rice andArabidopsis.

    Conclusions

    Root hair length and density are quantitative traits with complex and independent genetic control in rice. The genomic regions described here could be used as the basis for QTL development and further analysis of the genetic control of root hair length and density. We present a list of candidate genes involved in root hair formation and growth in rice, many of which have not been previously identified as having a relation to root hair growth. Since little is known about root hair growth in grasses, these provide a guide for further research and crop improvement.

     
    more » « less