
Title: specificity: an R package for analysis of feature specificity to environmental and higher dimensional variables, applied to microbiome species data
Abstract Background

Understanding the factors that influence microbes’ environmental distributions is important for determining drivers of microbial community composition. These include environmental variables like temperature and pH, and higher-dimensional variables like geographic distance and host species phylogeny. In microbial ecology, “specificity” is often described in the context of symbiotic or host-parasite interactions, but it can be used more broadly to describe the extent to which a species occupies a narrower range of an environmental variable than expected by chance. Using a standardization we describe here, Rao’s Quadratic Entropy (Theor Popul Biol, 1982; Sankhya A, 2010) can be conveniently applied to calculate specificity of a feature, such as a species, to many different environmental variables.


We present our R package specificity for performing the above analyses, and apply it to four real-life microbial data sets to demonstrate its use. We found that many fungi within the leaves of native Hawaiian plants had strong specificity to rainfall and elevation, even though these variables showed minimal importance in a previous analysis of fungal beta-diversity. In Antarctic cryoconite holes, our tool revealed that many bacteria have specificity to co-occurring algal community composition. Similarly, in the human gut microbiome, many bacteria showed specificity to the composition of bile acids. Finally, our analysis of the Earth Microbiome Project data set showed that most bacteria show strong ontological specificity to sample type. Our software performed as expected on synthetic data as well.


specificity is well suited to the analysis of microbiome data, both in synthetic test cases and across multiple environment types and experimental designs. The analysis and software we present here can reveal patterns in microbial taxa that may not be evident from a community-level perspective. These insights can also be visualized and interactively shared among researchers using specificity’s companion package, specificity.shiny.
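To make the idea in the abstract concrete, the following is a minimal Python sketch of Rao's Quadratic Entropy for one feature along one environmental variable, compared against a simple permutation null. It is an illustration under stated assumptions, not the specificity package's implementation: the function names, the toy elevation data, and the use of a plain permutation mean as the reference are all hypothetical, and the abstract does not spell out the standardization the authors use. The entropy is taken here as the double sum over all ordered sample pairs; some formulations sum over unordered pairs instead, which halves the value.

```python
import random


def rao_quadratic_entropy(weights, dist):
    """Rao's Q = sum_i sum_j d_ij * p_i * p_j, where p is weights scaled to sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    p = [w / total for w in weights]
    n = len(p)
    return sum(p[i] * p[j] * dist[i][j] for i in range(n) for j in range(n))


def permutation_null_mean(weights, dist, n_perm=999, seed=0):
    """Mean Q after shuffling the feature's abundances across samples (assumed null model)."""
    rng = random.Random(seed)
    shuffled = list(weights)
    values = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        values.append(rao_quadratic_entropy(shuffled, dist))
    return sum(values) / len(values)


# Hypothetical toy data: six samples along an elevation gradient (metres),
# and one species whose abundance is concentrated at low elevation.
elevation = [100, 200, 300, 1000, 1100, 1200]
abundance = [40, 50, 10, 0, 0, 0]

# Pairwise environmental dissimilarity: absolute difference in elevation.
dist = [[abs(a - b) for b in elevation] for a in elevation]

observed = rao_quadratic_entropy(abundance, dist)
null_mean = permutation_null_mean(abundance, dist)

# An observed value well below the null mean indicates that the species
# occupies a narrower range of elevation than expected by chance.
print(observed, null_mean)
```

Because the distance matrix can hold any pairwise dissimilarity, the same calculation extends to higher-dimensional variables such as geographic distance or host phylogenetic distance.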

Publisher / Repository:
Springer Science + Business Media
Journal Name:
Environmental Microbiome
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Arias, Renee S. (Ed.)

    Due to climate change, drought frequencies and severities are predicted to increase across the United States. Plant responses and adaptation to stresses depend on plant genetic and environmental factors. Understanding the effect of those factors on plant performance is required to predict species’ responses to environmental change. We used reciprocal gardens planted with distinct regional ecotypes of the perennial grass Andropogon gerardii adapted to dry, mesic, and wet environments to characterize their rhizosphere communities using 16S rRNA metabarcode sequencing. Even though the local microbial pool was the main driver of these rhizosphere communities, the significant plant ecotypic effect highlighted active microbial recruitment in the rhizosphere, driven by ecotype or plant genetic background. Our data also suggest that ecotypes planted at their homesites were more successful in recruiting rhizosphere community members that were unique to the location. The link between the plants’ homesite and the specific local microbes supported the “home field advantage” hypothesis. The unique homesite microbes may represent microbial specialists that are linked to plant stress responses. Furthermore, our data support ecotypic variation in the recruitment of congeneric but distinct bacterial variants, highlighting the nuanced plant ecotype effects on rhizosphere microbiome recruitment. These results improve our understanding of the complex plant host–soil microbe interactions and should facilitate further studies focused on exploring the functional potential of recruited microbes. Our study has the potential to aid in predicting grassland ecosystem responses to climate change and impact restoration management practices to promote grassland sustainability.


    In this study, we used reciprocal gardens located across a steep precipitation gradient to characterize rhizosphere communities of distinct dry, mesic, and wet regional ecotypes of the perennial grass Andropogon gerardii. We used 16S rRNA amplicon sequencing and focused oligotyping analysis and showed that even though location was the main driver of the microbial communities, ecotypes could potentially recruit distinct bacterial populations. We showed that different A. gerardii ecotypes were more successful in overall community recruitment and recruitment of microbes unique to the “home” environment, when growing at their “home site.” We found evidence for “home-field advantage” interactions between the host and host–root-associated bacterial communities, and the capability of ecotypes to recruit specialized microbes that were potentially linked to plant stress responses. Our study aids in better understanding the factors that affect plant adaptation, improving management strategies, and predicting grassland function under a changing climate.

  2. Abstract Background

    Root and soil microbial communities constitute the below-ground plant microbiome, are drivers of nutrient cycling, and affect plant productivity. However, our understanding of their spatiotemporal patterns is confounded by exogenous factors that covary spatially, such as changes in host plant species, climate, and edaphic factors. These spatiotemporal patterns likely differ across microbiome domains (bacteria and fungi) and niches (root vs. soil).


    To capture spatial patterns at a regional scale, we sampled the below-ground microbiome of switchgrass monocultures at five sites spanning > 3 degrees of latitude within the Great Lakes region. To capture temporal patterns, we sampled the below-ground microbiome across the growing season within a single site. We compared the strength of spatiotemporal factors with that of nitrogen addition to determine the major drivers in our perennial cropping system. All microbial communities were most strongly structured by sampling site, though collection date also had strong effects; in contrast, nitrogen addition had little to no effect on communities. Though all microbial communities were found to have significant spatiotemporal patterns, sampling site and collection date better explained bacterial than fungal community structure, which appeared more defined by stochastic processes. Root communities, especially bacterial, were more temporally structured than soil communities, which were more spatially structured, both across and within sampling sites. Finally, we characterized a core set of taxa in the switchgrass microbiome that persists across space and time. These core taxa represented < 6% of total species richness but > 27% of relative abundance, with potential nitrogen-fixing bacteria and fungal mutualists dominating the root community and saprotrophs dominating the soil community.


    Our results highlight the dynamic variability of plant microbiome composition and assembly across space and time, even within a single variety of a plant species. Root and soil fungal community compositions appeared spatiotemporally paired, while root and soil bacterial communities showed a temporal lag in compositional similarity, suggesting active recruitment of soil bacteria into the root niche throughout the growing season. A better understanding of the drivers of these differential responses to space and time may improve our ability to predict microbial community structure and function under novel conditions.


  3.

    Host-associated microbial communities can influence physiological processes of macroorganisms, including contributing to infectious disease resistance. For instance, some bacteria that live on amphibian skin produce antifungal compounds that inhibit two lethal fungal pathogens, Batrachochytrium dendrobatidis (Bd) and Batrachochytrium salamandrivorans (Bsal). Therefore, differences in microbiome composition among host species or populations within a species can contribute to variation in susceptibility to Bd/Bsal. This study applies 16S rRNA sequencing to characterize the skin bacterial microbiomes of three widespread terrestrial salamander genera native to the western United States. Using a metacommunity structure analysis, we identified dispersal barriers for these influential bacteria between salamander families and localities. We also analysed the effects of habitat characteristics such as percent natural cover and temperature seasonality on the microbiome. We found that certain environmental variables may influence the skin microbial communities of some salamander genera more strongly than others. Each salamander family had a somewhat distinct community of putative anti-Bd skin bacteria, suggesting that salamanders may select for a functional assembly of cutaneous symbionts that could differ in its ability to protect these amphibians from disease. Our observations raise the need to consider host identity and environmental heterogeneity during the selection of probiotics to treat wildlife diseases.

  4. Abstract Background

    Stable isotope probing (SIP) approaches are a critical tool in microbiome research to determine associations between species and substrates, as well as the activity of species. The application of these approaches ranges from studying microbial communities important for global biogeochemical cycling to host-microbiota interactions in the intestinal tract. Current SIP approaches, such as DNA-SIP or nanoSIMS, allow analysis of the incorporation of stable isotopes with high coverage of taxa in a community and at the single-cell level, respectively; however, they are limited in terms of sensitivity, resolution, or throughput.


    Here, we present an ultra-sensitive, high-throughput protein-based stable isotope probing approach (Protein-SIP), which cuts cost for labeled substrates by 50–99% as compared to other SIP and Protein-SIP approaches and thus enables isotope labeling experiments on much larger scales and with higher replication. The approach allows for the determination of isotope incorporation into microbiome members with species level resolution using standard metaproteomics liquid chromatography-tandem mass spectrometry (LC–MS/MS) measurements. At the core of the approach are new algorithms to analyze the data, which have been implemented in an open-source software ( We demonstrate sensitivity, precision and accuracy using bacterial cultures and mock communities with different labeling schemes. Furthermore, we benchmark our approach against two existing Protein-SIP approaches and show that in the low labeling range used our approach is the most sensitive and accurate. Finally, we measure translational activity using 18O heavy water labeling in a 63-species community derived from human fecal samples grown on media simulating two different diets. Activity could be quantified on average for 27 species per sample, with 9 species showing significantly higher activity on a high protein diet, as compared to a high fiber diet. Surprisingly, among the species with increased activity on high protein were several Bacteroides species known as fiber consumers. Apparently, protein supply is a critical consideration when assessing growth of intestinal microbes on fiber, including fiber-based prebiotics.


    We demonstrate that our Protein-SIP approach allows for the ultra-sensitive (0.01 to 10% label) detection of stable isotopes of elements found in proteins, using standard metaproteomics data.

  5. Abstract Background

    Microbiomes are now recognized as the main drivers of ecosystem function ranging from the oceans and soils to humans and bioreactors. However, a grand challenge in microbiome science is to characterize and quantify the chemical currencies of organic matter (i.e., metabolites) that microbes respond to and alter. Critical to this has been the development of Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS), which has drastically increased molecular characterization of complex organic matter samples, but challenges users with hundreds of millions of data points where readily available, user-friendly, and customizable software tools are lacking.


    Here, we build on years of analytical experience with diverse sample types to develop MetaboDirect, an open-source, command-line-based pipeline for the analysis (e.g., chemodiversity analysis, multivariate statistics), visualization (e.g., Van Krevelen diagrams, elemental and molecular class composition plots), and presentation of direct injection high-resolution FT-ICR MS data sets after molecular formula assignment has been performed. When compared to other available FT-ICR MS software, MetaboDirect is superior in that it requires a single line of code to launch a fully automated framework for the generation and visualization of a wide range of plots, with minimal coding experience required. Among the tools evaluated, MetaboDirect is also uniquely able to automatically generate biochemical transformation networks (ab initio) based on mass differences (mass difference network-based approach) that provide an experimental assessment of metabolite connections within a given sample or a complex metabolic system, thereby providing important information about the nature of the samples and the set of microbial reactions or pathways that gave rise to them. Finally, for more experienced users, MetaboDirect allows users to customize plots, outputs, and analyses.


    Application of MetaboDirect to FT-ICR MS-based metabolomic data sets from a marine phage-bacterial infection experiment and a Sphagnum leachate microbiome incubation experiment showcases the exploration capabilities of the pipeline, which will enable the research community to evaluate and interpret their data in greater depth and in less time. It will further advance our knowledge of how microbial communities influence and are influenced by the chemical makeup of the surrounding system. The source code and User’s guide of MetaboDirect are freely available through ( and (, respectively.
