
Title: Genome-Wide Association Analyses in the Model Rhizobium Ensifer meliloti
ABSTRACT Genome-wide association studies (GWAS) can identify genetic variants responsible for naturally occurring and quantitative phenotypic variation. Association studies therefore provide a powerful complement to approaches that rely on de novo mutations for characterizing gene function. Although bacteria should be amenable to GWAS, few GWAS have been conducted on bacteria, and the extent to which nonindependence among genomic variants (e.g., linkage disequilibrium [LD]) and the genetic architecture of phenotypic traits will affect GWAS performance is unclear. We apply association analyses to identify candidate genes underlying variation in 20 biochemical, growth, and symbiotic phenotypes among 153 strains of Ensifer meliloti. For 11 traits, we find genotype-phenotype associations that are stronger than expected by chance, with the candidates in relatively small linkage groups, indicating that LD does not preclude resolving association candidates to relatively small genomic regions. The significant candidates show an enrichment for nucleotide polymorphisms (SNPs) over gene presence-absence variation (PAV), and for five traits, candidates are enriched in large linkage groups, a possible signature of epistasis. Many of the variants most strongly associated with symbiosis phenotypes were in genes previously identified as being involved in nitrogen fixation or nodulation. For other traits, apparently strong associations were not stronger than the range of associations detected in permuted data. In sum, our data show that GWAS in bacteria may be a powerful tool for characterizing genetic architecture and identifying genes responsible for phenotypic variation. However, careful evaluation of candidates is necessary to avoid false signals of association. IMPORTANCE Genome-wide association analyses are a powerful approach for identifying gene function. 
These analyses are becoming commonplace in studies of humans, domesticated animals, and crop plants but have rarely been conducted in bacteria. We applied association analyses to 20 traits measured in Ensifer meliloti, an agriculturally and ecologically important bacterium because it fixes nitrogen when in symbiosis with leguminous plants. We identified candidate alleles and gene presence-absence variants underlying variation in symbiosis traits, antibiotic resistance, and use of various carbon sources; some of these candidates are in genes previously known to affect these traits whereas others were in genes that have not been well characterized. Our results point to the potential power of association analyses in bacteria, but also to the need to carefully evaluate the potential for false associations.
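The comparison against permuted data mentioned in the abstract can be illustrated with a small sketch. The record does not describe the test statistic or software the authors used, so everything below is an assumption made for illustration: variants are coded 0/1 per strain, association strength is the absolute difference in mean phenotype between the two allele groups, and the function names (`assoc_strength`, `strongest_association`, `permutation_p_value`) are hypothetical. The idea is to shuffle phenotypes among strains, record the strongest association in each shuffled data set, and ask how often that permuted maximum matches or beats the observed one.

```python
import random
from statistics import mean


def assoc_strength(genotypes, phenotypes):
    """Absolute difference in mean phenotype between allele-1 and allele-0 strains.

    Assumed statistic for illustration only; not taken from the study.
    """
    carriers = [p for g, p in zip(genotypes, phenotypes) if g == 1]
    others = [p for g, p in zip(genotypes, phenotypes) if g == 0]
    if not carriers or not others:
        return 0.0  # monomorphic variant: no contrast to test
    return abs(mean(carriers) - mean(others))


def strongest_association(variants, phenotypes):
    """Largest association strength across all variants."""
    return max(assoc_strength(v, phenotypes) for v in variants)


def permutation_p_value(variants, phenotypes, n_perm=1000, seed=0):
    """Compare the observed strongest association with permuted data sets.

    Returns the observed maximum and the fraction of permutations whose
    maximum is at least as large (with an add-one correction).
    """
    rng = random.Random(seed)
    observed = strongest_association(variants, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype link
        if strongest_association(variants, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: 20 strains, 3 variants; the first variant tracks the phenotype.
variants = [
    [1] * 10 + [0] * 10,
    [1, 0] * 10,
    [1] * 3 + [0] * 17,
]
phenotypes = [5.0] * 10 + [1.0] * 10
observed, p_value = permutation_p_value(variants, phenotypes)
```

Taking the maximum over all variants within each permutation accounts for the number of variants tested, which is one common way to guard against the false signals of association the abstract warns about.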
Award ID(s):
1724993 1237993
Sponsoring Org:
National Science Foundation
More Like this
  1. Tomato (Solanum lycopersicum L.) is a widely used model plant species for dissecting the genomic bases of complex traits, and thus provides an optimal platform for modern “-omics” studies and genome-guided breeding. Genome-wide association studies (GWAS) have become a preferred approach for screening large diverse populations and many traits. Here, we present GWAS analysis of a collection of 115 landraces and 11 vintage and modern cultivars. A total of 26 conventional descriptors, 40 traits obtained by digital phenotyping, the fruit content of six carotenoids recorded at the early ripening (breaker) and red-ripe stages and 21 climate-related variables were analyzed in the context of genetic diversity monitored in the 126 accessions. The data obtained from thorough phenotyping and the SNP diversity revealed by sequencing of ripe fruit transcripts of 120 of the tomato accessions were jointly analyzed to determine which genomic regions are implicated in the expressed phenotypic variation. This study reveals that the use of fruit RNA-Seq SNP diversity is effective not only for identification of genomic regions that underlie variation in fruit traits, but also of variation related to additional plant traits and adaptive responses to climate variation. These results allowed validation of our approach because different marker-trait associations mapped on chromosomal regions where other candidate genes for the same traits were previously reported. In addition, previously uncharacterized chromosomal regions were targeted as potentially involved in the expression of variable phenotypes, thus demonstrating that our tomato collection is a precious reservoir of diversity and an excellent tool for gene discovery.
  2. Abstract

    Classical genetic studies have identified many cases of pleiotropy where mutations in individual genes alter many different phenotypes. Quantitative genetic studies of natural genetic variants frequently examine one or a few traits, limiting their potential to identify pleiotropic effects of natural genetic variants. Widely adopted community association panels have been employed by plant genetics communities to study the genetic basis of naturally occurring phenotypic variation in a wide range of traits. High-density genetic marker data (18M markers) from 2 partially overlapping maize association panels comprising 1,014 unique genotypes grown in field trials across at least 7 US states and scored for 162 distinct trait data sets enabled the identification of 2,154 suggestive marker-trait associations and 697 confident associations in the maize genome using a resampling-based genome-wide association strategy. The precision of individual marker-trait associations was estimated to be 3 genes based on a reference set of genes with known phenotypes. Examples were observed of both genetic loci associated with variation in diverse traits (e.g., above-ground and below-ground traits), as well as individual loci associated with the same or similar traits across diverse environments. Many significant signals are located near genes whose functions were previously entirely unknown or estimated purely via functional data on homologs. This study demonstrates the potential of mining community association panel data using new higher-density genetic marker sets combined with resampling-based genome-wide association tests to develop testable hypotheses about gene functions, identify potential pleiotropic effects of natural genetic variants, and study genotype-by-environment interaction.

  3. Gilbert, Jack A. (Ed.)
    ABSTRACT Host association, the selective adaptation of pathogens to specific host species, evolves through constant interactions between host and pathogens, leaving much yet to be discovered about its immunological mechanisms and genomic determinants. The causative agents of Lyme disease (LD) are spirochete bacteria composed of multiple species of the Borrelia burgdorferi sensu lato complex, including B. burgdorferi (Bb), the main LD pathogen in North America and a useful model for the study of mechanisms underlying host-pathogen association. Host adaptation requires pathogens’ ability to evade host immune responses, such as complement, the first-line innate immune defense mechanism. We tested the hypothesis that different host-adapted phenotypes among Bb strains are linked to polymorphic loci that confer complement evasion traits in a host-specific manner. We first examined the survivability of 20 Bb strains in sera in vitro and/or bloodstream and tissues in vivo from rodent and avian LD models. Three groups of complement-dependent host-association phenotypes emerged. We analyzed complement-evasion genes, identified a priori among all strains and sequenced and compared genomes for individual strains representing each phenotype. The evolutionary history of ospC loci is correlated with host-specific complement-evasion phenotypes, while comparative genomics suggests that several gene families and loci are potentially involved in host association. This multidisciplinary work provides novel insights into the functional evolution of host-adapted phenotypes, building a foundation for further investigation of the immunological and genomic determinants of host association. IMPORTANCE Host association is a phenotype commonly found in many pathogens that preferentially survive in particular hosts. The Lyme disease (LD)-causing agent, B. burgdorferi (Bb), is an ideal model to study host association, as Bb is mainly maintained in nature through rodent and avian hosts. 
A widespread yet untested concept posits that host association in Bb strains is linked to Bb functional genetic variation conferring evasion to complement, an innate defense mechanism in vertebrate sera. Here, we tested this concept by grouping 20 Bb strains into three complement-dependent host-association phenotypes based on their survivability in sera and/or bloodstream and distal tissues in rodent and avian LD models. Phylogenomic analysis of these strains further correlated several gene families and loci, including ospC, with host-specific complement-evasion phenotypes. Such multifaceted studies thus pave the road to further identify the determinants of host association, providing mechanistic insights into host-pathogen interaction.
  4. Nielsen, Rasmus (Ed.)
    Abstract Population genomic analyses of high-altitude humans and other vertebrates have identified numerous candidate genes for hypoxia adaptation, and the physiological pathways implicated by such analyses suggest testable hypotheses about underlying mechanisms. Studies of highland natives that integrate genomic data with experimental measures of physiological performance capacities and subordinate traits are revealing associations between genotypes (e.g., hypoxia-inducible factor gene variants) and hypoxia-responsive phenotypes. The subsequent search for causal mechanisms is complicated by the fact that observed genotypic associations with hypoxia-induced phenotypes may reflect second-order consequences of selection-mediated changes in other (unmeasured) traits that are coupled with the focal trait via feedback regulation. Manipulative experiments to decipher circuits of feedback control and patterns of phenotypic integration can help identify causal relationships that underlie observed genotype–phenotype associations. Such experiments are critical for correct inferences about phenotypic targets of selection and mechanisms of adaptation.
  5. Quantification of the simultaneous contributions of loci to multiple traits, a phenomenon called pleiotropy, is facilitated by the increased availability of high-throughput genotypic and phenotypic data. To understand the prevalence and nature of pleiotropy, the ability of multivariate and univariate genome-wide association study (GWAS) models to distinguish between pleiotropic and non-pleiotropic loci in linkage disequilibrium (LD) first needs to be evaluated. Therefore, we used publicly available maize and soybean genotypic data to simulate multiple pairs of traits that were either (i) controlled by quantitative trait nucleotides (QTNs) on separate chromosomes, (ii) controlled by QTNs in various degrees of LD with each other, or (iii) controlled by a single pleiotropic QTN. We showed that multivariate GWAS could not distinguish between QTNs in LD and a single pleiotropic QTN. In contrast, a unique QTN detection rate pattern was observed for univariate GWAS whenever the simulated QTNs were in high LD or pleiotropic. Collectively, these results suggest that multivariate and univariate GWAS should both be used to infer whether or not causal mutations underlying peak GWAS associations are pleiotropic. Therefore, we recommend that future studies use a combination of multivariate and univariate GWAS models, as both models could be useful for identifying and narrowing down candidate loci with potential pleiotropic effects for downstream biological experiments.