Title: Dataset: Coral high molecular weight carbohydrates support opportunistic microbes in bacterioplankton from an algae-dominated reef
This dataset contains raw data for figures 5 (genus-level microbial community compositions) and 6 (predicted metabolic functions, pathway types), R code for PERMANOVAs (Table 3), DESeq2 and random forest (rfpermute) analyses, and R code to generate figures 5, 6b, S5 & S6.
Overview of .txt files:
 - Genus_16S_Counts.txt: Counts data used for DESeq2 analysis (Fig. 5c).
 - Genus_16S_relAbund.txt: Relative abundance data used for Fig. 5a, b & d.
 - MicFunPred_MetaCyc_types_all: Predicted pathway abundance data for all pathway types used for DESeq2 (Fig. 6b), PERMANOVA (Table 3) and column clustering of Fig. 6b.
 - MicFunPred_MetaCyc_AA_types.txt: Amino acids (Fig. 6b)
 - MicFunPred_MetaCyc_CH_types.txt: Carbohydrates (Fig. 6b)
 - MicFunPred_MetaCyc_EM _types.txt: Energy metabolism (Fig. 6b)
 - MicFunPred_MetaCyc_FAL _types.txt: Fatty acids and lipids (Fig. 6b)
 - MicFunPred_MetaCyc_SM _types.txt: Secondary metabolism (Fig. 6b)
 - MicFunPred_MetaCyc_OBiosyn _types.txt: Other biosynthesis (Fig. S6)
 - MicFunPred_MetaCyc_ODeg _types.txt: Other degradation (Fig. S6)
Award ID(s):
2023298
PAR ID:
10662925
Publisher / Repository:
Zenodo
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Authors: Wesley J. Sparagon, Milou G.I. Arts, Zachary Quinlan, Irina Koester, Jacqueline Comstock, Jessica A. Bullington, Craig A. Carlson, Pieter C. Dorrestein, Lihini I. Aluwihare, Linda Wegley Kelly, Andreas F. Haas and Craig E. Nelson. Contains relevant raw data, R code for analysis and figure generation, and output data frames used for figures.
  2. Files in this record:
 - DAMP21ka.nc: NetCDF file containing the model prior, proxy values, and DAMP21ka reconstruction for lake status, precipitation, and temperature variables.
 - clhancock/DAMP21ka-v1.0.0.zip: Notebooks used to generate figures for Hancock et al. (2024)
 - Holocene-code_development_hydroclimate.zip: Code used to generate the DAMP21ka reconstruction
Hancock, C. L., Erb, M. P., McKay, N. P., Dee, S. G., and Ivanovic, R.: A global Data Assimilation of Moisture Patterns from 21,000–0 BP (DAMP-21ka) using lake level proxy records
  3. This dataset includes statistically resampled monthly time series data of Arctic sea ice area and gridded data for March and September for sea ice concentration for a selection of large ensemble climate models and observational datasets. Arctic sea ice concentrations and areas are resampled from all available members of six coupled climate models from the Coupled Model Intercomparison Project 5 (CMIP5). These six models are: the second generation Canadian Earth System Model (CanESM2), the Community Earth System Model version 1 (CESM1), the Commonwealth Scientific and Industrial Research Organisation Global Climate Model Mark 3.6 (CSIRO MK3.6), the Geophysical Fluid Dynamics Laboratory Coupled Climate Model version 3 (GFDL CM3), the Geophysical Fluid Dynamics Laboratory Earth System Model version 2 with Modular Ocean Model version 4.1 (GFDL ESM2M), and the Max Planck Institute Earth System Model version 1 (MPI ESM1). The four observational datasets are the Hadley Centre Sea Ice and Sea Surface Temperature data set version 1 (HadISST1), the National Oceanic and Atmospheric Administration and National Snow and Ice Data Center Climate Data Record Version 4 (CDR), the National Aeronautics and Space Administration Team Algorithm (NT), and the National Aeronautics and Space Administration Bootstrap Team Algorithm (BT). The sea ice area data is resampled 10,000 times and then the standard deviation of those resamplings is calculated, which can be considered analogous to interannual variability of sea ice area (SIA). The standard deviation (sigma) and mean (mu) of these data represent the variability and typical values, respectively, of interannual variability found in each ensemble member or observational dataset. Sea ice concentration is resampled 1,000 times, and the same standard deviation and mean metrics are calculated for it.
This dataset was created to evaluate climate model projections of Arctic sea ice interannual variability and is used in the article Wyburn-Powell, Jahn, England (2022), Modeled Interannual Variability of Arctic Sea Ice Cover is Within Observational Uncertainty, Journal of Climate, https://doi.org/10.1175/JCLI-D-21-0958.1. This work was conducted at the University of Colorado Boulder from 2020-2022. The figures from the Journal of Climate article can be reproduced from the following datasets. The code used to create the datasets can be located at https://www.doi.org/10.5281/zenodo.6687725.
 - Figure 1: Sigma_obs_SIA.nc
 - Figure 2: Sigma_obs_SIA.nc, Mu_obs_SIA.nc, Sigma_mem_SIA.nc, Mu_mem_SIA.nc
 - Figure 3: Sigma_mem_varying_time_periods_1965_2066_03.nc, Sigma_LE_varying_time_periods_1965_2066_03.nc, Sigma_LE_varying_time_periods_1970_2040_09.nc, Sigma_obs_varying_time_periods_1953_2020.nc
 - Figure 4: Sigma_obs_SIA.nc, Sigma_mem_SIA.nc
 - Figure 5: Sigma_obs_SIA.nc
 - Figure 6: <model_name>_resampled_0<month>_individual.nc, <observational_dataset>_resampled_individual_1979_2020_03_09.nc
 - Figure 7: Sigma_obs_SIA.nc, Mu_obs_SIA.nc, Sigma_mem_SIA.nc, Mu_mem_SIA.nc
 - Figure 8: <model_name>_resampled_0<month>_individual.nc, <observational_dataset>_resampled_individual_1979_2020_03_09.nc
 - Figure 9: Sigma_mem_SIA.nc, Sigma_LE_SIA.nc
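The sigma and mu metrics described for this sea ice dataset can be illustrated with a small resampling sketch. This is only an illustration under stated assumptions: it uses a plain bootstrap (sampling with replacement), which may differ from the resampling scheme the authors actually used, and the function name `resample_std` and the toy anomaly values are hypothetical, not taken from the dataset.

```python
import random
import statistics


def resample_std(anomalies, n_resamples=10_000, seed=0):
    """Bootstrap the standard deviation of a 1-D series.

    Returns (mu, sigma): the mean and the standard deviation of the
    per-resample standard deviations.
    """
    rng = random.Random(seed)
    n = len(anomalies)
    stds = []
    for _ in range(n_resamples):
        # Draw n values with replacement (plain bootstrap; an assumption,
        # not necessarily the scheme used to build the dataset).
        sample = rng.choices(anomalies, k=n)
        stds.append(statistics.pstdev(sample))
    return statistics.fmean(stds), statistics.pstdev(stds)


# Made-up sea ice area anomalies, for illustration only.
toy_sia = [0.31, -0.12, 0.05, -0.40, 0.22, 0.18, -0.27, 0.09, -0.03, 0.14]
mu, sigma = resample_std(toy_sia, n_resamples=2_000)
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
```

Here mu plays the role of the typical interannual variability and sigma the spread of that estimate across resamplings; fixing the seed keeps the sketch reproducible.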
  4. Binder is a publicly accessible online service for executing interactive notebooks based on Git repositories. Binder dynamically builds and deploys containers following a recipe stored in the repository, then gives the user a browser-based notebook interface. The Binder group periodically releases a log of container launches from the public Binder service. Archives of launch records are available here. These records do not include identifiable information like IP addresses, but do give the source repo being launched along with some other metadata. The main content of this dataset is in the binder.sqlite file. This SQLite database includes launch records from 2018-11-03 to 2021-06-06 in the events table, which has the following schema.

    CREATE TABLE events(
        version INTEGER,
        timestamp TEXT,
        provider TEXT,
        spec TEXT,
        origin TEXT,
        ref TEXT,
        guessed_ref TEXT
    );
    CREATE INDEX idx_timestamp ON events(timestamp);

 - version indicates the version of the record as assigned by Binder. The origin field became available with version 3, and the ref field with version 4. Older records where this information was not recorded will have the corresponding fields set to null.
 - timestamp is the ISO timestamp of the launch.
 - provider gives the type of source repo being launched ("GitHub" is by far the most common). The rest of the explanations assume GitHub; other providers may differ.
 - spec gives the particular branch/release/commit being built. It consists of <github-id>/<repo>/<branch>.
 - origin indicates which backend was used. Each has its own storage, compute, etc., so this info might be important for evaluating caching and performance. Note that only recent records include this field. May be null.
 - ref specifies the git commit that was actually used, rather than the named branch referenced by spec. Note that this was not recorded from the beginning, so only the more recent entries include it. May be null.
 - For records where ref is not available, we attempted to clone the named reference given by spec rather than the specific commit (see below). The guessed_ref field records the commit found at the time of cloning. If the branch was updated since the container was launched, this will not be the exact version that was used, and instead will refer to whatever was available at the time (early 2021). Depending on the application, this might still be useful information. Selecting only records with version 4 (or non-null ref) will exclude these guessed commits. May be null.

The Binder launch dataset identifies the source repos that were used, but doesn't give any indication of their contents. We crawled GitHub to get the actual specification files in the repos which were fed into repo2docker when preparing the notebook environments, as well as filesystem metadata of the repos. Some repos were deleted/made private at some point, and were thus skipped. This is indicated by the absence of any row for the given commit (or absence of both ref and guessed_ref in the events table). The schema is as follows.

    CREATE TABLE spec_files (
        ref TEXT NOT NULL PRIMARY KEY,
        ls TEXT,
        runtime BLOB,
        apt BLOB,
        conda BLOB,
        pip BLOB,
        pipfile BLOB,
        julia BLOB,
        r BLOB,
        nix BLOB,
        docker BLOB,
        setup BLOB,
        postbuild BLOB,
        start BLOB
    );

Here ref corresponds to ref and/or guessed_ref from the events table. For each repo, we collected spec files into the following fields (see the repo2docker docs for details on what these are). The records in the database are simply the verbatim file contents, with no parsing or further processing performed.

 - runtime: runtime.txt
 - apt: apt.txt
 - conda: environment.yml
 - pip: requirements.txt
 - pipfile: Pipfile.lock or Pipfile
 - julia: Project.toml or REQUIRE
 - r: install.R
 - nix: default.nix
 - docker: Dockerfile
 - setup: setup.py
 - postbuild: postBuild
 - start: start

The ls field gives a metadata listing of the repo contents (excluding the .git directory). This field is JSON encoded with the following structure based on JSON types:

 - Object: filesystem directory. Keys are file names within it. Values are the contents, which can be regular files, symlinks, or subdirectories.
 - String: symlink. The string value gives the link target.
 - Number: regular file. The number value gives the file size in bytes.

    CREATE TABLE clean_specs (
        ref TEXT NOT NULL PRIMARY KEY,
        conda_channels TEXT,
        conda_packages TEXT,
        pip_packages TEXT,
        apt_packages TEXT
    );

The clean_specs table provides parsed and validated specifications for some of the specification files (currently Pip, Conda, and APT packages). Each column gives either a JSON encoded list of package requirements, or null. APT packages have been validated using a regex adapted from the repo2docker source. Pip packages have been parsed and normalized using the Requirement class from the pkg_resources package of setuptools. Conda packages have been parsed and normalized using the conda.models.match_spec.MatchSpec class included with the library form of Conda (distinct from the command line tool). Users might want to use these parsers when working with the package data, as the specifications can become fairly complex.

The missing table gives the repos that were not accessible, and event_logs records which log files have already been added. These tables are used for updating the dataset and should not be of interest to users.
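As a minimal sketch of how the events table described above might be queried, the following uses Python's standard sqlite3 module against an in-memory database built from the published schema. The rows inserted here are invented placeholders, and the helper name `exact_commit_launches` is hypothetical; with the real data one would instead connect to the binder.sqlite file.

```python
import sqlite3

# Schema copied from the dataset description.
SCHEMA = """
CREATE TABLE events(
    version INTEGER,
    timestamp TEXT,
    provider TEXT,
    spec TEXT,
    origin TEXT,
    ref TEXT,
    guessed_ref TEXT
);
CREATE INDEX idx_timestamp ON events(timestamp);
"""


def exact_commit_launches(conn):
    """Return launches whose commit was recorded rather than guessed."""
    query = """
        SELECT timestamp, spec, ref
        FROM events
        WHERE ref IS NOT NULL
        ORDER BY timestamp
    """
    return conn.execute(query).fetchall()


# Invented rows for illustration; none of these come from the dataset.
toy_rows = [
    (4, "2021-01-02T03:04:05", "GitHub", "someuser/somerepo/main",
     "example-origin", "abc123", None),
    (2, "2019-05-06T07:08:09", "GitHub", "someuser/oldrepo/master",
     None, None, "def456"),
    (4, "2021-02-01T00:00:00", "GitHub", "otheruser/otherrepo/main",
     "example-origin", "789abc", None),
]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?, ?, ?)", toy_rows)

launches = exact_commit_launches(conn)
for timestamp, spec, ref in launches:
    print(timestamp, spec, ref)
```

Filtering on a non-null ref follows the description's note that doing so excludes guessed commits; the older toy record, which only has guessed_ref, is dropped.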
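The JSON structure of the ls field in the Binder dataset (objects for directories, strings for symlinks, numbers for file sizes) lends itself to a short recursive walk. The sketch below is illustrative only: the function name `summarize_listing` and the sample listing are hypothetical, and real ls values would be read from the spec_files table.

```python
import json


def summarize_listing(node):
    """Return (file_count, total_bytes, symlink_count) for an ls tree."""
    if isinstance(node, dict):
        # Object: a directory whose values are files, symlinks, or subdirectories.
        files = size = links = 0
        for child in node.values():
            child_files, child_size, child_links = summarize_listing(child)
            files += child_files
            size += child_size
            links += child_links
        return files, size, links
    if isinstance(node, str):
        # String: a symlink; the value is the link target.
        return 0, 0, 1
    if isinstance(node, (int, float)) and not isinstance(node, bool):
        # Number: a regular file; the value is its size in bytes.
        return 1, node, 0
    raise TypeError(f"unexpected value in listing: {node!r}")


# Invented listing for illustration; not taken from the dataset.
toy_ls = json.loads(
    '{"README.md": 120, "environment.yml": 80,'
    ' "src": {"main.py": 300, "latest": "main.py"}}'
)
file_count, total_bytes, symlink_count = summarize_listing(toy_ls)
print(file_count, total_bytes, symlink_count)
```

Raising on unexpected types is a design choice for the sketch, so that malformed listings are noticed rather than silently skipped.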
  5. R code for Hastings, Y. D. (2022). Green Infrastructure Microbial Community Response to Simulated Pulse Precipitation Events in the Semi-Arid Western United States (Master's thesis, The University of Utah). This study was supported by a grant from the US National Science Foundation (DEB 2006308). R code for Hastings, Y. D., et al. Green Infrastructure Microbial Community Response to Simulated Pulse Precipitation Events in the Semi-Arid Western United States. In review.
Abstract: Nutrient retention in urban stormwater green infrastructure (SGI) of water-limited biomes is not well quantified, especially when stormwater inputs are scarce. We examined the role of plant diversity and physiochemistry as drivers of microbial community physiology and soil N pools and fluxes in bioswales subjected to simulated precipitation and a montane meadow experiencing natural rainfall within a semi-arid region during drought. Precipitation generally elevated soil moisture and pH, stimulated ecoenzyme activity, and increased the concentration of organic matter, proteins, and N pools in both bioswale and meadow soils, but the magnitude of change differed between events. Microbial community growth was static and N assimilation into biomass was limited across precipitation events. Unvegetated SGI plots had greater soil moisture, yet effects of plant diversity treatments on microbial C:N ratios, organic matter content, and N pools were inconsistent. Differences in soil N concentrations in bioswales and the meadow were most directly correlated to changes in organic matter content mediated by ecoenzyme expression and the balance of C, N, and P resources available to microbial communities. Our results add to growing evidence that ecological function of SGI is comparable to neighboring natural vegetated systems, particularly when soil media and water availability are similar.
The file and R code structure is as follows:
 - Data: Contains all data used for the analysis.
 - Results: Contains all figures, RMANOVA, and Piecewise Structural Equation Modeling results.
 - renv: R environment used for the project.
 - EEA_Vector_Analysis.R: R code used to analyze ecoenzyme (EEA) responses, including RMANOVA to look for significant differences in EEA response to simulated pulse events and Vector Analysis to determine the nutrient resource acquisition.
 - Gravimetric_soil_moisture_pH.R: R code used for RMANOVA of gravimetric soil moisture and pH responses to simulated pulse events.
 - MicrobialBiomass_EEA.Rproj: Downloaded R project.
 - Microbial_biomass.R: R code used for RMANOVA of microbial biomass carbon, nitrogen, and C:N responses to simulated pulse events.
 - OM_protien_N_pools_fluxes.R: R code used for RMANOVA of organic matter content, proteins, and N pools and fluxes responses to simulated pulse events.
 - PSEM_final.R: R code used for Pearson Correlation and Piecewise Structural Equation Modeling.
 - Rclimate.R: R code used to obtain summary statistics of climate data from GIRF and TM climate and soil sensors.