Title: Plant neighborhood shapes diversity and reduces interspecific variation of the phyllosphere microbiome. Meyer et al. 2022. ISME-J.
This is the data archive for: Meyer et al. 2022. Plant neighborhood shapes diversity and reduces interspecific variation of the phyllosphere microbiome. ISME-J. Please cite this article when using these archived data.
DOI: 10.1038/s41396-021-01184-6
Included are raw genetic sequences of the V5-V7 region of the 16S rRNA gene derived from experimental leaf surfaces of tomato, pepper, and bean plants.
Included in this archive are:
- Raw sequence data (RawFASTQ.zip)
- Reproducible R scripts (MeyerEtAl2021_RScript.R, VarPartSupplement.R)
- R objects corresponding to archived scripts (.RDS)
- Data for generating certain plots (PermanovaRValues.txt, PermanovaValuesByHost.txt, NeutralModelRValuesByHarvest.txt, VarPartHostEffects.txt)
- Sample metadata (NeighborhoodMetaData.txt)
- Phylogenetic tree file for sample ASVs (PhyloTree.tre)
- Geographic distance matrix for distances between plots (GeodistNeighborhood.txt)
- ddPCR (microbial abundance) data (ddPCR_Neighborhood.csv)
- R script for rarefaction function (Rarefy_mean.R)
- Taxonomic assignments for all ASVs in study (Taxonomy_Neighborhood.txt)
- R image files to load the R environment instead of running the scripts (MeyerEtAl2021_RScript.RData, VarPartSupplement.RData)
Award ID(s):
1754494
PAR ID:
10331763
Author(s) / Creator(s):
Publisher / Repository:
figshare
Date Published:
Subject(s) / Keyword(s):
60504 Microbial Ecology
Format(s):
Medium: X Size: 3018236214 Bytes
Size(s):
3018236214 Bytes
Sponsoring Org:
National Science Foundation
More Like this
  1. This repository contains our raw datasets from channel measurements performed at the University of Utah campus. In addition, we have included a document that explains the setup and methodology used to collect this data, as well as a very brief discussion of results.
    File organization:
    * documentation/ - Contains a .docx with the description of the setup and evaluation.
    * data/ - HDF5 files containing both metadata and raw IQ samples for each location at which data was collected. Notice we collected data at 14 different client locations. See map in the attached docx (skipped locations 12 and 16). We deployed 5 different receivers at 5 different rooftops. Due to resource constraints, one set of files contains data from 4 different locations whereas another set contains information from the single remaining location.
    We have developed a set of python scripts that allow us to parse and analyze the data. Although not included here, they can be found in our public repository: https://github.com/renew-wireless/RENEWLab You can find the top script here.
    For more information on the POWDER-RENEW project please visit the POWDER website. The RENEW part of the project focuses on the deployment of an open-source massive MIMO system. Please visit our website for more information.
  2. Datasets are often derived by manipulating raw data with statistical software packages. The derivation of a dataset must be recorded in terms of both the raw input and the manipulations applied to it. Statistics packages typically provide limited help in documenting provenance for the resulting derived data. At best, the operations performed by the statistical package are described in a script. Disparate representations make these scripts hard to understand for users. To address these challenges, we created Continuous Capture of Metadata (C2Metadata), a system to capture data transformations in scripts for statistical packages and represent them as metadata in a standard format that is easy to understand. We do so by devising a Structured Data Transformation Algebra (SDTA), which uses a small set of algebraic operators to express a large fraction of data manipulation performed in practice. We then implement SDTA, inspired by relational algebra, in a data transformation specification language we call SDTL. In this demonstration, we showcase C2Metadata’s capture of data transformations from a pool of sample transformation scripts in at least two languages: SPSS® and Stata® (SAS® and R are under development), for social science data in a large academic repository. We will allow the audience to explore C2Metadata using a web-based interface, visualize the intermediate steps and trace the provenance and changes of data at different levels for better understanding of the process.
  3. This data set for the manuscript entitled "Computational Design of a Cyclic Peptide that Inhibits the CTLA4 Immune Checkpoint Pathway" includes all files needed to run and analyze the simulations of a designed cyclic peptide (Peptide 16) bound to CTLA4 in the putative most stable binding configuration, which is detailed in Figure 6 of the paper. These files include molecular model structure files (NAMD psf), force field parameter files (in CHARMM format), initial atomic coordinates (pdb format), NAMD configuration files, NAMD output including restart files (in binary NAMD format) and trajectories in dcd format (downsampled to 10 ns per frame). Analysis is controlled by shell scripts (Bash-compatible) that call VMD Tcl scripts. These scripts and their output are also included.

    Version: 1.0

    Conventions Used in These Files
    ===============================

    Structure Files
    ----------------
    - ctla4_P16_wat.psf (original NAMD (XPLOR?) format psf file including atom details (type, charge, mass), as well as definitions of bonds, angles, dihedrals, and impropers for each dipeptide.)
    - ctla4_P16.pdb (initial coordinates before equilibration)
    - repart_*.psf (same as the above psf files, but the masses of non-water hydrogen atoms have been repartitioned by VMD script repartitionMass.tcl)
    - rest*.pdb (same as the above pdb files, but atoms have been marked for restraints in NAMD. These files are generated by doPrep.sh, with restraints applied to different atoms.)

    Force Field Parameters
    ----------------------
    CHARMM format parameter files:
    - par_all36m_prot.prm (CHARMM36m FF for proteins)
    - toppar_water_ions_prot.str (CHARMM water and ions with NBFIX parameters needed for protein and others commented out)

    Template NAMD Configuration Files
    ---------------------------------
    These contain the most commonly used simulation parameters. They are called by the other NAMD configuration files (which are in the namd/ subdirectory):
    - template_min.namd (minimization)
    - template_rest.namd (NPT equilibration with different parts of the protein restrained)
    - template_prod.namd (for the long production simulations)

    Minimization
    -------------
    - namd/min_*.0.namd

    Restraints
    -------------
    - namd/rest_*.0.namd (both CTLA4 binding site and peptide atoms are restrained)
    - namd/rest_*.1.namd (CA atoms of CTLA4 and all atoms of the peptide are restrained)
    - namd/rest_*.2.namd (all atoms of only the peptide are restrained)
    - namd/rest_*.3.namd (only CA atoms of only the peptide are restrained)
    - namd/rest_*.4.namd (no atoms are restrained)

    Production
    -------------
    - namd/pro_*.{D,E,F}.0.namd

    Analysis
    -------------
    - interaction.sh (Shell script for running analysis with VMD)
    - calcSeparationNearestAtom.tcl (Calculate the separation between two selections, taking the shortest distance between any pair of atoms spanning the two selections. Accounts for (orthogonal) periodic boundary conditions.)
    - useful.tcl (VMD Tcl script with a library of useful procs, used by the script above)
    - sep_*.dat (Output of the above analysis containing rows with two columns: time in nanoseconds and minimum distance in Å)

    Scripts
    -------
    Files with the .sh extension can be found throughout. These usually provide the highest level control for submission of simulations and analysis. Look to these as a guide to what is happening.
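The sep_*.dat files are described above as rows of two columns: time in nanoseconds and minimum distance in Å. A minimal Python sketch for reading such a file might look like the following. It assumes the columns are whitespace-delimited and that lines starting with `#` (if any) are comments; neither detail is stated in the archive description. The function names `parse_separation` and `closest_approach` and the sample values are hypothetical and for illustration only.

```python
from typing import List, Tuple


def parse_separation(text: str) -> List[Tuple[float, float]]:
    """Parse rows of 'time_ns distance_angstrom' into (time, distance) pairs.

    Assumption: whitespace-delimited columns; blank lines and lines
    starting with '#' are skipped.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        rows.append((float(fields[0]), float(fields[1])))
    return rows


def closest_approach(rows: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return the (time, distance) row with the smallest separation."""
    return min(rows, key=lambda r: r[1])


# Made-up sample data, not taken from the archive.
sample = "0.0 3.2\n10.0 2.9\n20.0 4.1\n"
rows = parse_separation(sample)
best_time, best_distance = closest_approach(rows)
print(f"closest approach: {best_distance} A at {best_time} ns")
```

To read a real file, pass the contents of one sep_*.dat file (for example via `open(path).read()`) to `parse_separation`.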
  4. IceCube is a cubic kilometer neutrino detector located at the South Pole. It generates 1 TiB of raw data per day, which must be archived for possible retrieval years or decades later. Other low-level data products are also archived for easy retrieval in the event of a catastrophic data center failure. The Long Term Archive software is IceCube's answer to archiving this data across several computing sites. 
  5. This dataset contains raw data, processed data, and the codes used for data processing in our manuscript from our Fourier-transform infrared (FTIR) spectroscopy, Nuclear magnetic resonance (NMR), Raman spectroscopy, and X-ray diffraction (XRD) experiments. The data and codes for the fits of our unpolarized Raman spectra to polypeptide spectra are also included. The following explains the folder structure of the data provided in this dataset, which is also explained in the file ReadMe.txt. Browsing the data in Tree view is recommended.

    Folder contents

    Codes
    - Raman Data Processing: The MATLAB script file RamanDecomposition.m contains the code to decompose the sub-peaks across different polarized Raman spectra (XX, XZ, ZX, ZZ, and YY), considering a set of pre-determined restrictions. The helper functions used in RamanDecomposition.m are included in the Helpers folder. RamanDecomposition.pdf is a PDF printout of the MATLAB code and output.
    - P Value Simulation:
      - 31_helix.ipynb and a_helix.ipynb: These two Jupyter Notebook files contain the intrinsic P value simulation for the 31-helix and alpha-helix structures. The simulation results were used to prepare Supplementary Table 4. See more details in the comments contained.
      - Vector.py, Atom.py, Amino.py, and Helpers.py: These python files contain the class definitions used in 31_helix.ipynb and a_helix.ipynb. See more details in the comments contained.

    FTIR
    - FTIR Raw Transmission.opj: This Origin data file contains the raw transmission data measured on a single silk strand and used for FTIR spectra analysis.
    - FTIR Deconvoluted Oscillators.opj: This Origin data file was generated from the data contained in the previous file using W-VASE software from J. A. Woollam, Inc.
    - FTIR Unpolarized MultiStrand Raw Transmission.opj: This Origin data file contains the raw transmission data measured on multiple silk strands.
    The datasets contained in the first two files above were used to plot Figure 2a-b, the FTIR data points in Figure 4a, and Supplementary Figure 6. The datasets contained in the third file above were used to plot Supplementary Figure 3a.

    NMR
    Raw data files of the 13C MAS NMR spectra:
    - ascii-spec_CP.txt: cross-polarized spectrum
    - ascii-spec_DP.txt: direct-polarized spectrum
    Data is in ASCII format (comma separated values) using the following columns:
    - Data point number
    - Intensity
    - Frequency [Hz]
    - Frequency [ppm]

    Polypeptide Spectrum Fits
    - MATLAB scripts (.m files) and Helpers: The MATLAB script files Raman_Fitting_Process_Part_1.m and Raman_Fitting_Process_Part_2.m contain the step-by-step instructions to perform the fitting process of our calculated unpolarized Raman spectrum, using digitized model polypeptide Raman spectra. The Helper folder contains two helper functions used by the above scripts. See the scripts for further instruction and information.
    - Data
      - aPA.csv, bPA.csv, GlyI.csv, GlyII.csv files: These csv files contain the digitized Raman spectra of poly-alanine, beta-alanine, poly-glycine-I, and poly-glycine-II.
      - Raman_Exp_Data.mat: This MATLAB data file contains the processed, polarized Raman spectra obtained from our experiments. Variable freq is the wavenumber information of each collected spectrum. The variables xx, yy, zz, xz, zx represent the polarized Raman spectra collected. These variables are used to calculate the unpolarized Raman spectrum in Raman_Fitting_Process_Part_2.m. See the scripts for further instruction and information.

    Raman
    - Raman Raw Data.mat: This MATLAB data file contains all the raw data used for Raman spectra analysis. All variables are of MATLAB structure data type. Each variable has fields called Freq and Raw, where Freq contains the wavenumber information of the measured spectra and Raw contains 5 measured Raman signal strengths. Variables XX, XZ, ZX, ZZ, and YY were used for plotting and sub-peak analysis for Figure 2c-d, Raman data points in Figure 4a, Figure 5b, Supplementary Figure 2, and Supplementary Figure 7. Variable WideRange was used to plot and identify the peaks for Supplementary Figure 3b.

    X-Ray
    - X-Ray.mat: This MATLAB data file contains the raw X-ray data used for the diffraction analysis in Supplementary Figure 5.
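The NMR files are described above as comma separated values with four columns: data point number, intensity, frequency in Hz, and frequency in ppm. A minimal Python sketch for reading that layout could look like this. It assumes there is no header row and that every data row has exactly those four numeric fields, which the description does not confirm. The names `NmrPoint` and `parse_nmr_ascii` and the sample values are hypothetical.

```python
import csv
import io
from typing import List, NamedTuple


class NmrPoint(NamedTuple):
    index: int        # data point number
    intensity: float  # signal intensity
    freq_hz: float    # frequency in Hz
    freq_ppm: float   # frequency (chemical shift) in ppm


def parse_nmr_ascii(text: str) -> List[NmrPoint]:
    """Parse comma separated rows into NmrPoint records.

    Assumption: no header row; rows with fewer than four fields
    (for example blank lines) are skipped.
    """
    points = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 4:
            continue
        points.append(
            NmrPoint(int(row[0]), float(row[1]), float(row[2]), float(row[3]))
        )
    return points


# Made-up sample data, not taken from the archive.
sample = "1,0.5,1000.0,10.0\n2,1.5,900.0,9.0\n"
points = parse_nmr_ascii(sample)
strongest = max(points, key=lambda p: p.intensity)
print(f"strongest point: #{strongest.index} at {strongest.freq_ppm} ppm")
```

The same function could be applied to the contents of ascii-spec_CP.txt or ascii-spec_DP.txt if they follow the assumed layout.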