Title: Few Fixed Variants between Trophic Specialist Pupfish Species Reveal Candidate Cis-Regulatory Alleles Underlying Rapid Craniofacial Divergence
Abstract: Investigating closely related species that rapidly evolved divergent feeding morphology is a powerful approach to identify genetic variation underlying variation in complex traits. This can also lead to the discovery of novel candidate genes influencing natural and clinical variation in human craniofacial phenotypes. We combined whole-genome resequencing of 258 individuals with 50 transcriptomes to identify candidate cis-acting genetic variation underlying rapidly evolving craniofacial phenotypes within an adaptive radiation of Cyprinodon pupfishes. This radiation consists of a dietary generalist species and two derived trophic niche specialists: a molluscivore and a scale-eating species. Despite extensive morphological divergence, these species diverged only 10 kya and produce fertile hybrids in the laboratory. Out of 9.3 million genome-wide SNPs and 80,012 structural variants, we found very few alleles fixed between species: only 157 SNPs and 87 deletions. Comparing gene expression across 38 purebred F1 offspring sampled at three early developmental stages, we identified 17 fixed variants within 10 kb of 12 genes that were highly differentially expressed between species. By measuring allele-specific expression in F1 hybrids from multiple crosses, we found that the majority of expression divergence between species was explained by trans-regulatory mechanisms. We also found strong evidence for two cis-regulatory alleles affecting expression divergence of two genes with putative effects on skeletal development (dync2li1 and pycr3). These results suggest that SNPs and structural variants contribute to the evolution of novel traits and highlight the utility of the San Salvador Island pupfish system as an evolutionary model for craniofacial development.
Award ID(s):
1938571
NSF-PAR ID:
10276284
Author(s) / Creator(s):
;
Editor(s):
Wittkopp, Patricia
Date Published:
Journal Name:
Molecular Biology and Evolution
Volume:
38
Issue:
2
ISSN:
1537-1719
Page Range / eLocation ID:
405 to 423
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. ABSTRACT Genome-wide association studies (GWAS) can identify genetic variants responsible for naturally occurring and quantitative phenotypic variation. Association studies therefore provide a powerful complement to approaches that rely on de novo mutations for characterizing gene function. Although bacteria should be amenable to GWAS, few GWAS have been conducted on bacteria, and the extent to which nonindependence among genomic variants (e.g., linkage disequilibrium [LD]) and the genetic architecture of phenotypic traits will affect GWAS performance is unclear. We apply association analyses to identify candidate genes underlying variation in 20 biochemical, growth, and symbiotic phenotypes among 153 strains of Ensifer meliloti. For 11 traits, we find genotype-phenotype associations that are stronger than expected by chance, with the candidates in relatively small linkage groups, indicating that LD does not preclude resolving association candidates to relatively small genomic regions. The significant candidates show an enrichment for nucleotide polymorphisms (SNPs) over gene presence-absence variation (PAV), and for five traits, candidates are enriched in large linkage groups, a possible signature of epistasis. Many of the variants most strongly associated with symbiosis phenotypes were in genes previously identified as being involved in nitrogen fixation or nodulation. For other traits, apparently strong associations were not stronger than the range of associations detected in permuted data. In sum, our data show that GWAS in bacteria may be a powerful tool for characterizing genetic architecture and identifying genes responsible for phenotypic variation. However, careful evaluation of candidates is necessary to avoid false signals of association.
    IMPORTANCE Genome-wide association analyses are a powerful approach for identifying gene function. These analyses are becoming commonplace in studies of humans, domesticated animals, and crop plants but have rarely been conducted in bacteria. We applied association analyses to 20 traits measured in Ensifer meliloti, an agriculturally and ecologically important bacterium because it fixes nitrogen when in symbiosis with leguminous plants. We identified candidate alleles and gene presence-absence variants underlying variation in symbiosis traits, antibiotic resistance, and use of various carbon sources; some of these candidates are in genes previously known to affect these traits whereas others were in genes that have not been well characterized. Our results point to the potential power of association analyses in bacteria, but also to the need to carefully evaluate the potential for false associations.
  2. Abstract

    Sex determination, the developmental process by which sexually dimorphic phenotypes are established, evolves fast. Evolutionary turnover in a sex determination pathway may occur via selection on alleles that are genetically linked to a new master sex determining locus on a newly formed proto‐sex chromosome. Species with polygenic sex determination, in which master regulatory genes are found on multiple different proto‐sex chromosomes, are informative models to study the evolution of sex determination and sex chromosomes. House flies are such a model system, with male determining loci possible on all six chromosomes and a female‐determiner on one of the chromosomes as well. The two most common male‐determining proto‐Y chromosomes form latitudinal clines on multiple continents, suggesting that temperature variation is an important selection pressure responsible for maintaining polygenic sex determination in this species. Temperature‐dependent fitness effects could be manifested through temperature‐dependent gene expression differences across proto‐Y chromosome genotypes. These gene expression differences may be the result of cis‐regulatory variants that affect the expression of genes on the proto‐sex chromosomes, or trans effects of the proto‐Y chromosomes on genes elsewhere in the genome. We used RNA‐seq to identify genes whose expression depends on proto‐Y chromosome genotype and temperature in adult male house flies. We found no evidence for ecologically meaningful temperature‐dependent expression differences of sex determining genes between male genotypes, but we were probably not sampling an appropriate developmental time‐point to identify such effects. In contrast, we identified many other genes whose expression depends on the interaction between proto‐Y chromosome genotype and temperature, including genes that encode proteins involved in reproduction, metabolism, lifespan, stress response, and immunity. Notably, genes with genotype‐by‐temperature interactions on expression were not enriched on the proto‐sex chromosomes. Moreover, there was no evidence that temperature‐dependent expression is driven by chromosome‐wide cis‐regulatory divergence between the proto‐Y and proto‐X alleles. Therefore, if temperature‐dependent gene expression is responsible for differences in phenotypes and fitness of proto‐Y genotypes across house fly populations, these effects are driven by a small number of temperature‐dependent alleles on the proto‐Y chromosomes that may have trans effects on the expression of genes on other chromosomes.

     
  3. Abstract

    Adaptive radiations are often characterized by the rapid evolution of traits associated with divergent feeding modes. For example, the evolutionary history of African cichlids is marked by repeated and coordinated shifts in skull, trophic, fin and body shape. Here, we seek to explore the molecular basis for fin shape variation in Lake Malawi cichlids. We first described variation within an F2 mapping population derived by crossing two cichlid species with divergent morphologies including fin shape. We then used this population to genetically map loci that influence variation in this trait. We found that the genotype–phenotype map for fin shape is largely distinct from other morphological characters including body and craniofacial shape. These data suggest that key aspects of fin, body and jaw shape are genetically modular and that the coordinated evolution of these traits in cichlids is more likely due to common selective pressures than to pleiotropy or linkage. We next combined genetic mapping data with population‐level genome scans to identify wnt7aa and col1a1 as candidate genes underlying variation in the number of pectoral fin ray elements. Gene expression patterns across species with different fin morphologies and small molecule manipulation of the Wnt pathway during fin development further support the hypothesis that variation at these loci underlies divergence in fin shape between cichlid species. In all, our data provide additional insights into the genetic and molecular mechanisms associated with morphological divergence in this important adaptive radiation.

     
  4. Abstract

    The study of local adaptation in the presence of ongoing gene flow is the study of natural selection in action, revealing the functional genetic diversity most relevant to contemporary pressures. In addition to individual genes, genome-wide architecture can itself evolve to enable adaptation. Distributed across a steep thermal gradient along the east coast of North America, Atlantic silversides (Menidia menidia) exhibit an extraordinary degree of local adaptation in a suite of traits, and the capacity for rapid adaptation from standing genetic variation, but we know little about the patterns of genomic variation across the species range that enable this remarkable adaptability. Here, we use low-coverage, whole-transcriptome sequencing of Atlantic silversides sampled along an environmental cline to show marked signatures of divergent selection across a gradient of neutral differentiation. Atlantic silversides sampled across 1371 km of the southern section of its distribution have very low genome-wide differentiation (median FST = 0.006 across 1.9 million variants), consistent with historical connectivity and observations of recent migrants. Yet almost 14,000 single nucleotide polymorphisms (SNPs) are nearly fixed (FST > 0.95) for alternate alleles. Highly differentiated SNPs cluster into four tight linkage disequilibrium (LD) blocks that span hundreds of genes and several megabases. Variants in these LD blocks are disproportionately nonsynonymous and concentrated in genes enriched for multiple functions related to known adaptations in silversides, including variation in lipid storage, metabolic rate, and spawning behavior. Elevated levels of absolute divergence and demographic modeling suggest selection maintaining divergence across these blocks under gene flow. These findings represent an extreme case of heterogeneity in levels of differentiation across the genome, and highlight how gene flow shapes genomic architecture in continuous populations. 
Locally adapted alleles may be common features of populations distributed along environmental gradients, and will likely be key to conserving variation to enable future responses to environmental change.

     
  5. INTRODUCTION Genome-wide association studies (GWASs) have identified thousands of human genetic variants associated with diverse diseases and traits, and most of these variants map to noncoding loci with unknown target genes and function. Current approaches to understand which GWAS loci harbor causal variants and to map these noncoding regulators to target genes suffer from low throughput. With newer multiancestry GWASs from individuals of diverse ancestries, there is a pressing and growing need to scale experimental assays to connect GWAS variants with molecular mechanisms. Here, we combined biobank-scale GWASs, massively parallel CRISPR screens, and single-cell sequencing to discover target genes of noncoding variants for blood trait loci with systematic targeting and inhibition of noncoding GWAS loci with single-cell sequencing (STING-seq).
    RATIONALE Blood traits are highly polygenic, and GWASs have identified thousands of noncoding loci that map to candidate cis-regulatory elements (CREs). By combining CRE-silencing CRISPR perturbations and single-cell readouts, we targeted hundreds of GWAS loci in a single assay, revealing target genes in cis and in trans. For select CREs that regulate target genes, we performed direct variant insertion. Although silencing the CRE can identify the target gene, direct variant insertion can identify magnitude and direction of effect on gene expression for the GWAS variant. In select cases in which the target gene was a transcription factor or microRNA, we also investigated the gene-regulatory networks altered upon CRE perturbation and how these networks differ across blood cell types.
    RESULTS We inhibited candidate CREs from fine-mapped blood trait GWAS variants (from ~750,000 individuals of diverse ancestries) in human erythroid progenitors. In total, we targeted 543 variants (254 loci) mapping to candidate CREs, generating multimodal single-cell data including transcriptome, direct CRISPR gRNA capture, and cell surface proteins. We identified target genes in cis (within 500 kb) for 134 CREs. In most cases, we found that the target gene was the closest gene and that specific enhancer-associated biochemical hallmarks (H3K27ac and accessible chromatin) are essential for CRE function. Using multiple perturbations at the same locus, we were able to distinguish causal variants from noncausal variants in linkage disequilibrium. For a subset of validated CREs, we also inserted specific GWAS variants using base-editing STING-seq (beeSTING-seq) and quantified the effect size and direction of GWAS variants on gene expression. Given our transcriptome-wide data, we examined dosage effects in cis and trans in cases in which the cis target is a transcription factor or microRNA. We found that trans target genes are also enriched for GWAS loci, and identified gene clusters within trans gene networks with distinct biological functions and expression patterns in primary human blood cells.
    CONCLUSION In this work, we investigated noncoding GWAS variants at scale, identifying target genes in single cells. These methods can help to address the variant-to-function challenges that are a barrier for translation of GWAS findings (e.g., drug targets for diseases with a genetic basis) and greatly expand our ability to understand mechanisms underlying GWAS loci.
    Identifying causal variants and their target genes with STING-seq. Uncovering causal variants and their target genes or function is a major challenge for GWASs. STING-seq combines perturbation of noncoding loci with multimodal single-cell sequencing to profile hundreds of GWAS loci in parallel. This approach can identify target genes in cis and trans, measure dosage effects, and decipher gene-regulatory networks.