Title: Barcoded bulk QTL mapping reveals highly polygenic and epistatic architecture of complex traits in yeast
Mapping the genetic basis of complex traits is critical to uncovering the biological mechanisms that underlie disease and other phenotypes. Genome-wide association studies (GWAS) in humans and quantitative trait locus (QTL) mapping in model organisms can now explain much of the observed heritability in many traits, allowing us to predict phenotype from genotype. However, constraints on power due to statistical confounders in large GWAS and smaller sample sizes in QTL studies still limit our ability to resolve numerous small-effect variants, map them to causal genes, identify pleiotropic effects across multiple traits, and infer non-additive interactions between loci (epistasis). Here, we introduce barcoded bulk quantitative trait locus (BB-QTL) mapping, which allows us to construct, genotype, and phenotype 100,000 offspring of a budding yeast cross, two orders of magnitude larger than the previous state of the art. We use this panel to map the genetic basis of eighteen complex traits, finding that the genetic architecture of these traits involves hundreds of small-effect loci densely spaced throughout the genome, many with widespread pleiotropic effects across multiple traits. Epistasis plays a central role, with thousands of interactions that provide insight into genetic networks. By dramatically increasing sample size, BB-QTL mapping demonstrates the potential of natural variants in high-powered QTL studies to reveal the highly polygenic, pleiotropic, and epistatic architecture of complex traits.
Award ID(s):
1764269 1914916
PAR ID:
10328876
Journal Name:
eLife
Volume:
11
ISSN:
2050-084X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Goss's wilt, caused by the Gram-positive actinobacterium Clavibacter nebraskensis, is an important bacterial disease of maize. The molecular and genetic mechanisms of resistance to the bacterium, or, in general, Gram-positive bacteria causing plant diseases, remain poorly understood. Here, we examined the genetic basis of Goss's wilt through differential gene expression, standard genome-wide association mapping (GWAS), extreme phenotype (XP) GWAS using highly resistant (R) and highly susceptible (S) lines, and quantitative trait locus (QTL) mapping using 3 bi-parental populations, identifying 11 disease association loci. Three loci were validated using near-isogenic lines or recombinant inbred lines. Our analysis indicates that Goss's wilt resistance is highly complex and major resistance genes are not commonly present. RNA sequencing of samples separately pooled from R and S lines with or without bacterial inoculation was performed, enabling identification of common and differential gene responses in R and S lines. Based on expression, in both R and S lines, the photosynthesis pathway was silenced upon infection, while stress-responsive pathways and phytohormone pathways, namely, abscisic acid, auxin, ethylene, jasmonate, and gibberellin, were markedly activated. In addition, 65 genes showed differential responses (up- or down-regulated) to infection in R and S lines. Combining genetic mapping and transcriptional data, individual candidate genes conferring Goss's wilt resistance were identified. Collectively, aspects of the genetic architecture of Goss's wilt resistance were revealed, providing foundational data for mechanistic studies. 
  2. Macdonald, S (Ed.)
    Abstract Quantitative genetics in Caenorhabditis elegans seeks to identify naturally segregating genetic variants that underlie complex traits. Genome-wide association studies scan the genome for individual genetic variants that are significantly correlated with phenotypic variation in a population, or quantitative trait loci. Genome-wide association studies are a popular choice for quantitative genetic analyses because the quantitative trait loci that are discovered segregate in natural populations. Despite numerous successful mapping experiments, the empirical performance of genome-wide association study has not, to date, been formally evaluated in C. elegans. We developed an open-source genome-wide association study pipeline called NemaScan and used a simulation-based approach to provide benchmarks of mapping performance in collections of wild C. elegans strains. Simulated trait heritability and complexity determined the spectrum of quantitative trait loci detected by genome-wide association studies. Power to detect smaller-effect quantitative trait loci increased with the number of strains sampled from the C. elegans Natural Diversity Resource. Population structure was a major driver of variation in mapping performance, with populations shaped by recent selection exhibiting significantly lower false discovery rates than populations composed of more divergent strains. We also recapitulated previous genome-wide association studies of experimentally validated quantitative trait variants. Our simulation-based evaluation of performance provides the community with critical context to pursue quantitative genetic studies using the C. elegans Natural Diversity Resource to elucidate the genetic basis of complex traits in C. elegans natural populations. 
  3. Abstract Data reduction methods are frequently employed in large genomics and phenomics studies to extract core patterns, reduce dimensionality, and alleviate multiple testing effects. Principal component analysis (PCA), in particular, identifies the components that capture the most variance within omics datasets. While data reduction can simplify complex datasets, it remains unclear how the use of PCA impacts downstream analyses such as quantitative trait loci (QTL) or genome-wide association (GWA) approaches and their biological interpretation. In QTL studies, an alternative to data reduction is the use of post-hoc data summarization approaches, such as hotspot analysis, which involves mapping individual traits and consolidating results based on shared genomic locations. To evaluate how different analytical approaches may alter the biological insights derived from multi-dimensional QTL datasets, we compared individual trait hotspots with PCA-based QTL mapping using transcriptomic and metabolomic data from a structured recombinant inbred line population. Interestingly, these two approaches identified different genomic regions and genetic architectures. These findings suggest that mapping PCA-reduced data does not merely streamline analyses but may generate a fundamentally different view of the underlying genetic architecture compared to individual trait mapping and hotspot analysis. Thus, the use of PCA and other data reduction techniques prior to QTL or GWAS mapping should be carefully considered to ensure alignment with the specific biological question being addressed. 
  4. Abstract Invasive species offer outstanding opportunities to identify the genomic sources of variation that contribute to rapid adaptation, as well as the genetic mechanisms facilitating invasions. The Eurasian plant yellow starthistle (Centaurea solstitialis) is highly invasive in North and South American grasslands and known to have evolved increased growth and reproduction during invasion. Here, we develop new genomic resources for C. solstitialis and map the genetic basis of invasiveness traits. We present a chromosome-scale (1N = 8) reference genome using PacBio CLR and Dovetail Omni-C technologies, and functional gene annotation using RNAseq. We find repeat structure typical of the family Asteraceae, with over 25% of gene content derived from ancestral whole-genome duplications (paleologs). Using an F2 mapping population derived from a cross between native and invading parents, with a restriction site-associated DNA (RAD)-based genetic map, we validate the assembly and identify 13 quantitative trait loci underpinning size traits that have evolved during invasion. We find evidence that large effects of quantitative trait loci may be associated with structural variants between native and invading genotypes, including a variant with an overdominant and pleiotropic effect on key invader traits. We also find evidence of significant paleolog enrichment under two quantitative trait loci. Our results add to growing evidence of the importance of structural variants in evolution, and to understanding of the rapid evolution of invaders. 
  5. Abstract Classical genetic studies have identified many cases of pleiotropy where mutations in individual genes alter many different phenotypes. Quantitative genetic studies of natural genetic variants frequently examine one or a few traits, limiting their potential to identify pleiotropic effects of natural genetic variants. Widely adopted community association panels have been employed by plant genetics communities to study the genetic basis of naturally occurring phenotypic variation in a wide range of traits. High-density genetic marker data (18M markers) from 2 partially overlapping maize association panels comprising 1,014 unique genotypes grown in field trials across at least 7 US states and scored for 162 distinct trait data sets enabled the identification of 2,154 suggestive marker-trait associations and 697 confident associations in the maize genome using a resampling-based genome-wide association strategy. The precision of individual marker-trait associations was estimated to be 3 genes based on a reference set of genes with known phenotypes. Examples were observed of both genetic loci associated with variation in diverse traits (e.g., above-ground and below-ground traits), as well as individual loci associated with the same or similar traits across diverse environments. Many significant signals are located near genes whose functions were previously entirely unknown or estimated purely via functional data on homologs. This study demonstrates the potential of mining community association panel data using new higher-density genetic marker sets combined with resampling-based genome-wide association tests to develop testable hypotheses about gene functions, identify potential pleiotropic effects of natural genetic variants, and study genotype-by-environment interaction.