Title: Barcoded bulk QTL mapping reveals highly polygenic and epistatic architecture of complex traits in yeast
Mapping the genetic basis of complex traits is critical to uncovering the biological mechanisms that underlie disease and other phenotypes. Genome-wide association studies (GWAS) in humans and quantitative trait locus (QTL) mapping in model organisms can now explain much of the observed heritability in many traits, allowing us to predict phenotype from genotype. However, constraints on power due to statistical confounders in large GWAS and smaller sample sizes in QTL studies still limit our ability to resolve numerous small-effect variants, map them to causal genes, identify pleiotropic effects across multiple traits, and infer non-additive interactions between loci (epistasis). Here, we introduce barcoded bulk quantitative trait locus (BB-QTL) mapping, which allows us to construct, genotype, and phenotype 100,000 offspring of a budding yeast cross, two orders of magnitude larger than the previous state of the art. We use this panel to map the genetic basis of eighteen complex traits, finding that the genetic architecture of these traits involves hundreds of small-effect loci densely spaced throughout the genome, many with widespread pleiotropic effects across multiple traits. Epistasis plays a central role, with thousands of interactions that provide insight into genetic networks. By dramatically increasing sample size, BB-QTL mapping demonstrates the potential of natural variants in high-powered QTL studies to reveal the highly polygenic, pleiotropic, and epistatic architecture of complex traits.
Award ID(s):
1764269 1914916
NSF-PAR ID:
10328876
Journal Name:
eLife
Volume:
11
ISSN:
2050-084X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    The genetic architecture of phenotypic traits can affect the mode and tempo of trait evolution. Human‐altered environments can impose strong natural selection, where successful evolutionary adaptation requires swift and large phenotypic shifts. In these scenarios, theory predicts that adaptation is due to a few adaptive variants of large effect, but empirical studies that have revealed the genetic architecture of rapidly evolved phenotypes are rare, especially for populations inhabiting polluted environments. Fundulus killifish have repeatedly evolved adaptive resistance to extreme pollution in urban estuaries. Prior studies, including genome scans for signatures of natural selection, have revealed some of the genes and pathways important for evolved pollution resistance, and provide context for the genotype–phenotype association studies reported here. We created multiple quantitative trait locus (QTL) mapping families using progenitors from four different resistant populations and, using RAD‐seq, genetically mapped variation in sensitivity (developmental perturbations) following embryonic exposure to a model toxicant, PCB‐126. We found that one to two large‐effect QTLs accounted for resistance to PCB‐mediated developmental toxicity. QTLs harbored candidate genes that govern the regulation of aryl hydrocarbon receptor (AHR) signaling. One QTL was shared across all populations and another was shared across three populations. One QTL showed strong signatures of recent natural selection in the corresponding wild population but another did not. Some candidate genes for PCB resistance inferred from genome scans in wild populations were identified as QTLs, but some key candidate genes were not. We conclude that rapidly evolved resistance to the developmental defects normally caused by PCB‐126 is governed by few genes of large effect. However, other aspects of resistance beyond developmental phenotypes may be governed by additional loci, such that comprehensive resistance to PCB‐126, and to the mixtures of chemicals that distinguish urban estuaries more broadly, may be more genetically complex.

     
  2. Abstract

    In many species, temperature‐sensitive phenotypic plasticity (i.e., an individual's phenotypic response to temperature) displays a positive correlation with latitude, a pattern presumed to reflect local adaptation. This geographical pattern raises two general questions: (a) Do a few large‐effect genes contribute to latitudinal variation in a trait? (b) Is the thermal plasticity of different traits regulated pleiotropically? To address the questions, we crossed individuals of Plantago lanceolata derived from northern and southern European populations. Individuals naturally exhibited high and low thermal plasticity in floral reflectance and flowering time. We grew parents and offspring in controlled cool‐ and warm‐temperature environments, mimicking what plants would encounter in nature. We obtained genetic markers via genotype‐by‐sequencing, produced the first recombination map for this ecologically important nonmodel species, and performed quantitative trait locus (QTL) mapping of thermal plasticity and single‐environment values for both traits. We identified a large‐effect QTL that largely explained the reflectance plasticity differences between northern and southern populations. We identified multiple smaller‐effect QTLs affecting aspects of flowering time, one of which affected flowering time plasticity. The results indicate that the genetic architecture of thermal plasticity in flowering is more complex than for reflectance. One flowering time QTL showed strong cytonuclear interactions under cool temperatures. Reflectance and flowering plasticity QTLs did not colocalize, suggesting little pleiotropic genetic control and freedom for independent trait evolution. Such genetic information about the architecture of plasticity is environmentally important because it informs us about the potential for plasticity to offset negative effects of climate change.

     
  3. Abstract

    Goss's wilt, caused by the Gram-positive actinobacterium Clavibacter nebraskensis, is an important bacterial disease of maize. The molecular and genetic mechanisms of resistance to the bacterium, or, in general, Gram-positive bacteria causing plant diseases, remain poorly understood. Here, we examined the genetic basis of Goss's wilt through differential gene expression, standard genome-wide association mapping (GWAS), extreme phenotype (XP) GWAS using highly resistant (R) and highly susceptible (S) lines, and quantitative trait locus (QTL) mapping using 3 bi-parental populations, identifying 11 disease association loci. Three loci were validated using near-isogenic lines or recombinant inbred lines. Our analysis indicates that Goss's wilt resistance is highly complex and major resistance genes are not commonly present. RNA sequencing of samples separately pooled from R and S lines with or without bacterial inoculation was performed, enabling identification of common and differential gene responses in R and S lines. Based on expression, in both R and S lines, the photosynthesis pathway was silenced upon infection, while stress-responsive pathways and phytohormone pathways, namely, abscisic acid, auxin, ethylene, jasmonate, and gibberellin, were markedly activated. In addition, 65 genes showed differential responses (up- or down-regulated) to infection in R and S lines. Combining genetic mapping and transcriptional data, individual candidate genes conferring Goss's wilt resistance were identified. Collectively, aspects of the genetic architecture of Goss's wilt resistance were revealed, providing foundational data for mechanistic studies.

     
  4. INTRODUCTION Genome-wide association studies (GWASs) have identified thousands of human genetic variants associated with diverse diseases and traits, and most of these variants map to noncoding loci with unknown target genes and function. Current approaches to understand which GWAS loci harbor causal variants and to map these noncoding regulators to target genes suffer from low throughput. With newer multiancestry GWASs from individuals of diverse ancestries, there is a pressing and growing need to scale experimental assays to connect GWAS variants with molecular mechanisms. Here, we combined biobank-scale GWASs, massively parallel CRISPR screens, and single-cell sequencing to discover target genes of noncoding variants for blood trait loci with systematic targeting and inhibition of noncoding GWAS loci with single-cell sequencing (STING-seq). RATIONALE Blood traits are highly polygenic, and GWASs have identified thousands of noncoding loci that map to candidate cis-regulatory elements (CREs). By combining CRE-silencing CRISPR perturbations and single-cell readouts, we targeted hundreds of GWAS loci in a single assay, revealing target genes in cis and in trans. For select CREs that regulate target genes, we performed direct variant insertion. Although silencing the CRE can identify the target gene, direct variant insertion can identify magnitude and direction of effect on gene expression for the GWAS variant. In select cases in which the target gene was a transcription factor or microRNA, we also investigated the gene-regulatory networks altered upon CRE perturbation and how these networks differ across blood cell types. RESULTS We inhibited candidate CREs from fine-mapped blood trait GWAS variants (from ~750,000 individuals of diverse ancestries) in human erythroid progenitors. In total, we targeted 543 variants (254 loci) mapping to candidate CREs, generating multimodal single-cell data including transcriptome, direct CRISPR gRNA capture, and cell surface proteins. We identified target genes in cis (within 500 kb) for 134 CREs. In most cases, we found that the target gene was the closest gene and that specific enhancer-associated biochemical hallmarks (H3K27ac and accessible chromatin) are essential for CRE function. Using multiple perturbations at the same locus, we were able to distinguish causal variants from noncausal variants in linkage disequilibrium. For a subset of validated CREs, we also inserted specific GWAS variants using base-editing STING-seq (beeSTING-seq) and quantified the effect size and direction of GWAS variants on gene expression. Given our transcriptome-wide data, we examined dosage effects in cis and trans in cases in which the cis target is a transcription factor or microRNA. We found that trans target genes are also enriched for GWAS loci, and identified gene clusters within trans gene networks with distinct biological functions and expression patterns in primary human blood cells. CONCLUSION In this work, we investigated noncoding GWAS variants at scale, identifying target genes in single cells. These methods can help to address the variant-to-function challenges that are a barrier for translation of GWAS findings (e.g., drug targets for diseases with a genetic basis) and greatly expand our ability to understand mechanisms underlying GWAS loci. Identifying causal variants and their target genes with STING-seq. Uncovering causal variants and their target genes or function is a major challenge for GWASs. STING-seq combines perturbation of noncoding loci with multimodal single-cell sequencing to profile hundreds of GWAS loci in parallel. This approach can identify target genes in cis and trans, measure dosage effects, and decipher gene-regulatory networks.
  5. Wilson, Daniel ; Parkhill, Julian (Ed.)
    ABSTRACT A goal of modern biology is to develop the genotype-phenotype (G→P) map, a predictive understanding of how genomic information generates trait variation that forms the basis of both natural and managed communities. As microbiome research advances, however, it has become clear that many of these traits are symbiotic extended phenotypes, being governed by genetic variation encoded not only by the host’s own genome, but also by the genomes of myriad cryptic symbionts. Building a reliable G→P map therefore requires accounting for the multitude of interacting genes and even genomes involved in symbiosis. Here, we use naturally occurring genetic variation in 191 strains of the model microbial symbiont Sinorhizobium meliloti paired with two genotypes of the host Medicago truncatula in four genome-wide association studies (GWAS) to determine the genomic architecture of a key symbiotic extended phenotype: partner quality, or the fitness benefit conferred to a host by a particular symbiont genotype, within and across environmental contexts and host genotypes. We define three novel categories of loci in rhizobium genomes that must be accounted for if we want to build a reliable G→P map of partner quality; namely, (i) loci whose identities depend on the environment, (ii) those that depend on the host genotype with which rhizobia interact, and (iii) universal loci that are likely important in all or most environments. IMPORTANCE Given the rapid rise of research on how microbiomes can be harnessed to improve host health, understanding the contribution of microbial genetic variation to host phenotypic variation is pressing, and will better enable us to predict the evolution of (and select more precisely for) symbiotic extended phenotypes that impact host health. We uncover extensive context-dependency in both the identity and functions of symbiont loci that control host growth, which makes predicting the genes and pathways important for determining symbiotic outcomes under different conditions more challenging. Despite this context-dependency, we also resolve a core set of universal loci that are likely important in all or most environments, and thus, serve as excellent targets both for genetic engineering and future coevolutionary studies of symbiosis.