Title: Comprehensive Phylogenomics of Methylobacterium Reveals Four Evolutionary Distinct Groups and Underappreciated Phyllosphere Diversity
Abstract Methylobacterium is a group of methylotrophic microbes associated with soil, fresh water, and particularly the phyllosphere, the aerial part of plants that has been well studied in terms of physiology but whose evolutionary history and taxonomy are unclear. Recent work has suggested that Methylobacterium is much more diverse than thought previously, questioning its status as an ecologically and phylogenetically coherent taxonomic genus. However, taxonomic and evolutionary studies of Methylobacterium have mostly been restricted to model species, often isolated from habitats other than the phyllosphere and have yet to utilize comprehensive phylogenomic methods to examine gene trees, gene content, or synteny. By analyzing 189 Methylobacterium genomes from a wide range of habitats, including the phyllosphere, we inferred a robust phylogenetic tree while explicitly accounting for the impact of horizontal gene transfer (HGT). We showed that Methylobacterium contains four evolutionarily distinct groups of bacteria (namely A, B, C, D), characterized by different genome size, GC content, gene content, and genome architecture, revealing the dynamic nature of Methylobacterium genomes. In addition to recovering 59 described species, we identified 45 candidate species, mostly phyllosphere-associated, stressing the significance of plants as a reservoir of Methylobacterium diversity. We inferred an ancient transition from a free-living lifestyle to association with plant roots in the Methylobacteriaceae ancestor, followed by phyllosphere association of three of the major groups (A, B, D), whose early branching in Methylobacterium history has been heavily obscured by HGT. Together, our work lays the foundations for a thorough redefinition of Methylobacterium taxonomy, beginning with the abandonment of Methylorubrum.
Angert, Esther
Award ID(s):
Publication Date:
Journal Name: Genome Biology and Evolution
Sponsoring Org: National Science Foundation
More Like this
  1. Dinoflagellates of the family Symbiodiniaceae are crucial photosymbionts in corals and other marine organisms. Of these, Cladocopium goreaui is one of the most dominant symbiont species in the Indo-Pacific. Here, we present an improved genome assembly of C. goreaui combining new long-read sequence data with previously generated short-read data. Incorporating new full-length transcripts to guide gene prediction, the C. goreaui genome (1.2 Gb) exhibits a high extent of completeness (82.4% based on BUSCO protein recovery) and better resolution of repetitive sequence regions; 45,322 gene models were predicted, and 327 putative, topologically associated domains of the chromosomes were identified. Comparison with other Symbiodiniaceae genomes revealed a prevalence of repeats and duplicated genes in C. goreaui, and lineage-specific genes indicating functional innovation. Incorporating 2,841,408 protein sequences from 96 taxonomically diverse eukaryotes and representative prokaryotes in a phylogenomic approach, we assessed the evolutionary history of C. goreaui genes. Of the 5246 phylogenetic trees inferred from homologous protein sets containing two or more phyla, 35–36% have putatively originated via horizontal gene transfer (HGT), predominantly (19–23%) via an ancestral Archaeplastida lineage implicated in the endosymbiotic origin of plastids: 10–11% are of green algal origin, including genes encoding photosynthetic functions. Our results demonstrate the utility of long-read sequence data in resolving structural features of a dinoflagellate genome, and highlight how genetic transfer has shaped genome evolution of a facultative symbiont, and more broadly of dinoflagellates.
  2. Keim, Paul (Ed.)
    ABSTRACT Methylobacterium is a prevalent bacterial genus of the phyllosphere. Despite its ubiquity, little is known about the extent to which its diversity reflects neutral processes like migration and drift, versus environmental filtering of life history strategies and adaptations. In two temperate forests, we investigated how phylogenetic diversity within Methylobacterium is structured by biogeography, seasonality, and growth strategies. Using deep, culture-independent barcoded marker gene sequencing coupled with culture-based approaches, we uncovered a considerable diversity of Methylobacterium in the phyllosphere. We cultured different subsets of Methylobacterium lineages depending upon the temperature of isolation and growth (20°C or 30°C), suggesting long-term adaptation to temperature. To a lesser extent than temperature adaptation, Methylobacterium diversity was also structured across large (>100 km; between forests) and small (<1.2 km; within forests) geographical scales, among host tree species, and was dynamic over seasons. By measuring the growth of 79 isolates during different temperature treatments, we observed contrasting growth performances, with strong lineage- and season-dependent variations in growth strategies. Finally, we documented a progressive replacement of lineages with a high-yield growth strategy typical of cooperative, structured communities in favor of those characterized by rapid growth, resulting in convergence and homogenization of community structure at the end of the growing season. Together, our results show how Methylobacterium is phylogenetically structured into lineages with distinct growth strategies, which helps explain their differential abundance across regions, host tree species, and time. This work paves the way for further investigation of adaptive strategies and traits within a ubiquitous phyllosphere genus. IMPORTANCE Methylobacterium is a bacterial group tied to plants. Despite the ubiquity of methylobacteria and the importance to their hosts, little is known about the processes driving Methylobacterium community dynamics. By combining traditional culture-dependent and -independent (metabarcoding) approaches, we monitored Methylobacterium diversity in two temperate forests over a growing season. On the surface of tree leaves, we discovered remarkably diverse and dynamic Methylobacterium communities over short temporal (from June to October) and spatial (within 1.2 km) scales. Because we cultured different subsets of Methylobacterium diversity depending on the temperature of incubation, we suspected that these dynamics partly reflected climatic adaptation. By culturing strains under laboratory conditions mimicking seasonal variations, we found that diversity and environmental variations were indeed good predictors of Methylobacterium growth performances. Our findings suggest that Methylobacterium community dynamics at the surface of tree leaves result from the succession of strains with contrasting growth strategies in response to environmental variations.
  3. Abstract

    The Cyclophyllidea is the most diverse order of tapeworms, encompassing species that infect all classes of terrestrial tetrapods including humans and domesticated animals. Available phylogenetic reconstructions based either on morphology or molecular data lack the resolution to allow scientists to either propose a solid taxonomy or infer evolutionary associations. Molecular markers available for the Cyclophyllidea mostly include ribosomal DNA and mitochondrial loci. In this study, we identified 3641 single‐copy nuclear coding loci by comparing the genomes of Hymenolepis microstoma, Echinococcus granulosus and Taenia solium. We designed RNA baits based on the sequence of H. microstoma, and applied target enrichment and Illumina sequencing to test the utility of those baits to recover loci useful for phylogenetic analyses. We captured DNA from five species of tapeworms representing two families of cyclophyllideans. We obtained an average of 3284 (90%) of the targets from the test samples and then used captured sequences (2 181 361 bp in total; fragment size ranging from 301 to 6969 bp) to reconstruct a phylogeny for the five test species plus the three species for which genomic data are available. The results were consistent with the current consensus regarding cyclophyllidean relationships. To assess the potential for our method to yield informative genetic variation at intraspecific scales, we extracted 14 074 single nucleotide polymorphisms (SNPs) from alignments of four Arostrilepis macrocirrosa and two A. cooki and successfully inferred their relationships. The results showed that our target gene tools yield data sets that provide robust inferences at a range of taxonomic scales in the Cyclophyllidea.
  4. Abstract

    Prokaryotic genomes are often considered to be mosaics of genes that do not necessarily share the same evolutionary history due to widespread horizontal gene transfers (HGTs). Consequently, representing evolutionary relationships of prokaryotes as bifurcating trees has long been controversial. However, studies reporting conflicts among gene trees derived from phylogenomic data sets have shown that these conflicts can be the result of artifacts or evolutionary processes other than HGT, such as incomplete lineage sorting, low phylogenetic signal, and systematic errors due to substitution model misspecification. Here, we present the results of an extensive exploration of phylogenetic conflicts in the cyanobacterial order Nostocales, for which previous studies have inferred strongly supported conflicting relationships when using different concatenated phylogenomic data sets. We found that most of these conflicts are concentrated in deep clusters of short internodes of the Nostocales phylogeny, where the great majority of individual genes have low resolving power. We then inferred phylogenetic networks to detect HGT events while also accounting for incomplete lineage sorting. Our results indicate that most conflicts among gene trees are likely due to incomplete lineage sorting linked to an ancient rapid radiation, rather than to HGTs. Moreover, the short internodes of this radiation fit the expectations of the anomaly zone, i.e., a region of the tree parameter space where a species tree is discordant with its most likely gene tree. We demonstrated that concatenation of different sets of loci can recover up to 17 distinct and well-supported relationships within the putative anomaly zone of Nostocales, corresponding to the observed conflicts among well-supported trees based on concatenated data sets from previous studies. 
Our findings highlight the important role of rapid radiations as a potential cause of strongly conflicting phylogenetic relationships when using phylogenomic data sets of bacteria. We propose that polytomies may be the most appropriate phylogenetic representation of these rapid radiations that are part of anomaly zones, especially when all possible genomic markers have been considered to infer these phylogenies. [Anomaly zone; bacteria; horizontal gene transfer; incomplete lineage sorting; Nostocales; phylogenomic conflict; rapid radiation; Rhizonema.]
  5. All life on earth is linked by a shared evolutionary history. Even before Darwin developed the theory of evolution, Linnaeus categorized types of organisms based on their shared traits. We now know these traits derived from these species’ shared ancestry. This evolutionary history provides a natural framework to harness the enormous quantities of biological data being generated today. The Open Tree of Life project is a collaboration developing tools to curate and share evolutionary estimates (phylogenies) covering the entire tree of life (Hinchliff et al. 2015, McTavish et al. 2017). The tree is viewable at, and the data is all freely available online. The taxon identifiers used in the Open Tree unified taxonomy (Rees and Cranston 2017) are mapped to identifiers across biological informatics databases, including the Global Biodiversity Information Facility (GBIF), NCBI, and others. Linking these identifiers allows researchers to easily unify data from across these different resources (Fig. 1). Leveraging a unified evolutionary framework across the diversity of life provides new avenues for integrative wide scale research. Downstream tools, such as R packages developed by the R OpenSci foundation (rotl, rgbif) (Michonneau et al. 2016, Chamberlain 2017) and other tools (Revell 2012), make accessing and combining this information straightforward for students as well as researchers (e.g. Figure 1. Example linking phylogenetic relationships accessed from the Open Tree of Life with specimen location data from Global Biodiversity Information Facility. For example, a recent publication by Santorelli et al. 2018 linked evolutionary information from Open Tree with species locality data gathered from a local field study as well as GBIF species location records to test a river-barrier hypothesis in the Amazon. 
By combining these data, the authors were able to test a widely held biogeographic hypothesis across 1952 species in 14 taxonomic groups, and found that a river that had been postulated to drive endemism was in fact not a barrier to gene flow. However, data provenance and taxonomic name reconciliation remain key hurdles to applying data from these large digital biodiversity and evolution community resources to answering biological questions. In the Amazonian river analysis, while they leveraged use of GBIF records as a secondary check on their species records, they relied on an intensive local field study for their major conclusions, and preferred taxon specific phylogenetic resources over Open Tree where they were available (Santorelli et al. 2018). When Li et al. 2018 assessed large scale phylogenetic approaches, including Open Tree, for measuring community diversity, they found that synthesis phylogenies were less resolved than purpose-built phylogenies, but also found that these synthetic phylogenies were sufficient for community level phylogenetic diversity analyses. Nonetheless, data quality concerns have limited adoption of analyses data from centralized resources (McTavish et al. 2017). Taxonomic name recognition and reconciliation across databases also remain a hurdle for large scale analyses, despite several ongoing efforts to improve taxonomic interoperability and unify taxonomies, such as Catalogue of Life + (Bánki et al. 2018). In order to support innovative science, large scale digital data resources need to facilitate data linkage between resources, and address researchers' data quality and provenance concerns. I will present the model that the Open Tree of Life is using to provide evolutionary data at the scale of the entire tree of life, while maintaining traceable provenance to the publications and taxonomies these evolutionary relationships are inferred from. 
I will discuss the hurdles to adoption of these large scale resources by researchers, as well as the opportunities for new research avenues provided by the connections between evolutionary inferences and biodiversity digital databases.