Title: COAWST model simulations of warm and cool nearshore rip-current plumes

This archive contains COAWST model input, grids and initial conditions, and output used to produce the results in a submitted manuscript. The files are:

model_input.zip: input files for simulations presented in this paper
  ocean_rip_current.in: ROMS ocean model input file
  swan_rip_current.in: SWAN wave model input file (example with Hs=1m)
  coupling_rip_current.in: model coupling file
  rip_current.h: model header file
  
model_grids_forcing.zip: bathymetry and initial condition files
     hbeach_grid_isbathy_2m.nc: ROMS bathymetry input file
     hbeach_grid_isbathy_2m.bot: SWAN bathymetry input file
     hbeach_grid_isbathy_2m.grd: SWAN grid input file
     hbeach_init_isbathy_14_18_17.nc: Initial temperature, cool surf zone dT=-1C case
     hbeach_init_isbathy_14_18_19.nc: Initial temperature, warm surf zone dT=+1C case
     hbeach_init_isbathy_14_18_16.nc: Initial temperature, cool surf zone dT=-2C case
     hbeach_init_isbathy_14_18_20.nc: Initial temperature, warm surf zone dT=+2C case
     hbeach_init_isbathy_14_18_17p5.nc: Initial temperature, cool surf zone dT=-0.5C case
     hbeach_init_isbathy_14_18_18p5.nc: Initial temperature, warm surf zone dT=+0.5C case

model_output files: model output used to produce the figures
     netcdf files, zipped
     variables included:
          x_rho (cross-shore coordinate, m)
          y_rho (alongshore coordinate, m)
          z_rho (vertical coordinate, m)
          ocean_time (time since initialization, s, output every 5 mins)
          h (bathymetry, m)
          temp (temperature, Celsius)
          dye_02 (surfzone-released dye)
          Hwave (wave height, m)
          Dissip_break (wave dissipation, W/m2)
          ubar (cross-shore depth-average velocity, m/s, interpolated to rho-points)
     Case_141817.nc: cool surf zone dT=-1C Hs=1m
     Case_141819.nc: warm surf zone dT=+1C Hs=1m
     Case_141816.nc: cool surf zone dT=-2C Hs=1m
     Case_141820.nc: warm surf zone dT=+2C Hs=1m
     Case_141817p5.nc: cool surf zone dT=-0.5C Hs=1m
     Case_141818p5.nc: warm surf zone dT=+0.5C Hs=1m
     Case_141817_Hp5.nc: cool surf zone dT=-1C Hs=0.5m
     Case_141819_Hp5.nc: warm surf zone dT=+1C Hs=0.5m
     Case_141817_Hp75.nc: cool surf zone dT=-1C Hs=0.75m
     Case_141819_Hp75.nc: warm surf zone dT=+1C Hs=0.75m
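
The output file names above appear to encode the run parameters. The following is a minimal Python sketch of decoding them, not part of the archive. It assumes, based only on the list above, that the digits after `Case_1418` give the surf-zone temperature in Celsius (with `p` standing for a decimal point), that dT is measured relative to 18 C, and that files without an `_H` suffix are the Hs=1m runs. The names `parse_case`, `REFERENCE_TEMP_C`, and `DEFAULT_HS_M` are hypothetical.

```python
import re

# Assumed naming scheme, inferred from the file list (not documented by the authors):
#   Case_1418<T>[_H<h>].nc, where <T> is the surf-zone temperature in Celsius
#   and "p" stands for a decimal point (17p5 -> 17.5, p75 -> 0.75).
CASE_PATTERN = re.compile(r"^Case_1418(\d+(?:p\d+)?)(?:_H(\d*p\d+))?\.nc$")
REFERENCE_TEMP_C = 18.0  # assumption: dT is relative to 18 C
DEFAULT_HS_M = 1.0       # assumption: no _H suffix means Hs = 1 m


def _to_float(token):
    """Convert tokens such as '17p5' or 'p75' to floats (17.5, 0.75)."""
    text = token.replace("p", ".")
    if text.startswith("."):
        text = "0" + text
    return float(text)


def parse_case(filename):
    """Return (dT in Celsius, Hs in metres) decoded from one output file name."""
    match = CASE_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognised case file name: {filename}")
    surf_temp = _to_float(match.group(1))
    hs = _to_float(match.group(2)) if match.group(2) else DEFAULT_HS_M
    return surf_temp - REFERENCE_TEMP_C, hs
```

For example, `parse_case("Case_141817_Hp5.nc")` decodes to a cool surf zone (dT of -1 C) with Hs of 0.5 m, matching the description in the list.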

COAWST is open-source code and can be downloaded from https://coawstmodel-trac.sourcerepo.com/coawstmodel_COAWST/. Descriptions of the input and output files can be found in the manual distributed with the model code and in the glossary at the end of the ocean.in file.

Corresponding author: Melissa Moulton, mmoulton@uw.edu

 
Award ID(s):
2048303
PAR ID:
10330765
Publisher / Repository:
Zenodo
Sponsoring Org:
National Science Foundation
More Like this
  1. This data set for the manuscript entitled "Design of Peptides that Fold and Self-Assemble on Graphite" includes all files needed to run and analyze the simulations described in the manuscript in the molecular dynamics software NAMD, as well as the output of the simulations. The files are organized into directories corresponding to the figures of the main text and supporting information. They include molecular model structure files (NAMD psf or Amber prmtop format), force field parameter files (in CHARMM format), initial atomic coordinates (pdb format), NAMD configuration files, Colvars configuration files, NAMD log files, and NAMD output including restart files (in binary NAMD format) and trajectories in dcd format (downsampled to 10 ns per frame). Analysis is controlled by shell scripts (Bash-compatible) that call VMD Tcl scripts or Python scripts. These scripts and their output are also included.

    Version: 2.0

    Changes versus version 1.0 are the addition of the free energy of folding, adsorption, and pairing calculations (Sim_Figure-7) and shifting of the figure numbers to accommodate this addition.


    Conventions Used in These Files
    ===============================

    Structure Files
    ----------------
    - graph_*.psf or sol_*.psf (original NAMD (XPLOR?) format psf file including atom details (type, charge, mass), as well as definitions of bonds, angles, dihedrals, and impropers for each dipeptide.)

    - graph_*.pdb or sol_*.pdb (initial coordinates before equilibration)
    - repart_*.psf (same as the above psf files, but the masses of non-water hydrogen atoms have been repartitioned by VMD script repartitionMass.tcl)
    - freeTop_*.pdb (same as the above pdb files, but the carbons of the lower graphene layer have been placed at a single z value and marked for restraints in NAMD)
    - amber_*.prmtop (combined topology and parameter files for Amber force field simulations)
    - repart_amber_*.prmtop (same as the above prmtop files, but the masses of non-water hydrogen atoms have been repartitioned by ParmEd)

    Force Field Parameters
    ----------------------
    CHARMM format parameter files:
    - par_all36m_prot.prm (CHARMM36m FF for proteins)
    - par_all36_cgenff_no_nbfix.prm (CGenFF v4.4 for graphene) The NBFIX parameters are commented out since they are only needed for aromatic halogens and we use only the CG2R61 type for graphene.
    - toppar_water_ions_prot_cgenff.str (CHARMM water and ions with NBFIX parameters needed for protein and CGenFF included and others commented out)

    Template NAMD Configuration Files
    ---------------------------------
    These contain the most commonly used simulation parameters. They are called by the other NAMD configuration files (which are in the namd/ subdirectory):
    - template_min.namd (minimization)
    - template_eq.namd (NPT equilibration with lower graphene fixed)
    - template_abf.namd (for adaptive biasing force)

    Minimization
    -------------
    - namd/min_*.0.namd

    Equilibration
    -------------
    - namd/eq_*.0.namd

    Adaptive biasing force calculations
    -----------------------------------
    - namd/eabfZRest7_graph_chp1404.0.namd
    - namd/eabfZRest7_graph_chp1404.1.namd (continuation of eabfZRest7_graph_chp1404.0.namd)

    Log Files
    ---------
    For each NAMD configuration file given in the last two sections, there is a log file with the same prefix, which gives the text output of NAMD. For instance, the output of namd/eabfZRest7_graph_chp1404.0.namd is eabfZRest7_graph_chp1404.0.log.

    Simulation Output
    -----------------
    The simulation output files (which match the names of the NAMD configuration files) are in the output/ directory. Files with the extensions .coor, .vel, and .xsc are coordinates in NAMD binary format, velocities in NAMD binary format, and extended system information (including cell size) in text format, respectively. Files with the extension .dcd give the trajectory of the atomic coordinates over time (and also include system cell information). Due to storage limitations, large DCD files have been omitted or replaced with new DCD files having the prefix stride50_, which include only every 50th frame. The time between frames in these files is 50 * 50000 steps/frame * 4 fs/step = 10 ns. The system cell trajectory is also included for the NPT runs as output/eq_*.xst.
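
The frame-interval arithmetic quoted above can be checked with a short Python sketch. The numbers come from the paragraph; the function names are hypothetical and only illustrate the unit conversion.

```python
# Values quoted in the paragraph above.
STRIDE = 50               # every 50th frame is kept in the stride50_ files
STEPS_PER_FRAME = 50_000  # simulation steps between frames before striding
TIMESTEP_FS = 4           # femtoseconds per simulation step

FS_PER_NS = 1_000_000     # 1 ns = 10^6 fs


def frame_interval_ns(stride=STRIDE, steps_per_frame=STEPS_PER_FRAME,
                      timestep_fs=TIMESTEP_FS):
    """Time between consecutive frames of a strided trajectory, in ns."""
    return stride * steps_per_frame * timestep_fs / FS_PER_NS


def frame_time_ns(frame_index):
    """Time elapsed between frame 0 and the given frame of a strided trajectory, in ns."""
    return frame_index * frame_interval_ns()
```

With the defaults, `frame_interval_ns()` evaluates to 10.0, so frame 25 of a strided trajectory lies 250 ns after its first frame.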

    Scripts
    -------
    Files with the .sh extension can be found throughout. These usually provide the highest level control for submission of simulations and analysis. Look to these as a guide to what is happening. If there are scripts with step1_*.sh and step2_*.sh, they are intended to be run in order, with step1_*.sh first.


    CONTENTS
    ========

    The directory contents are as follows. The directories Sim_Figure-1 and Sim_Figure-8 include README.txt files that describe the files and naming conventions used throughout this data set.

    Sim_Figure-1: Simulations of N-acetylated C-amidated amino acids (Ac-X-NHMe) at the graphite–water interface.

    Sim_Figure-2: Simulations of different peptide designs (including acyclic, disulfide cyclized, and N-to-C cyclized) at the graphite–water interface.

    Sim_Figure-3: MM-GBSA calculations of different peptide sequences for a folded conformation and 5 misfolded/unfolded conformations.

    Sim_Figure-4: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-5: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 295 K.

    Sim_Figure-5_replica: Temperature replica exchange molecular dynamics simulations for the peptide cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) with 20 replicas for temperatures from 295 to 454 K.

    Sim_Figure-6: Simulation of the peptide molecule cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) in free solution (no graphite).

    Sim_Figure-7: Free energy calculations for folding, adsorption, and pairing for the peptide CHP1404 (sequence: cyc(GTGSGTG-GPGG-GCGTGTG-SGPG)). For folding, we calculate the PMF as a function of RMSD by replica-exchange umbrella sampling (in the subdirectory Folding_CHP1404_Graphene/). We make the same calculation in solution, which required 3 separate replica-exchange umbrella sampling calculations (in the subdirectory Folding_CHP1404_Solution/). Both PMF of RMSD calculations for the scrambled peptide are in Folding_scram1404/. For adsorption, calculation of the PMF for the orientational restraints and the calculation of the PMF along z (the distance between the graphene sheet and the center of mass of the peptide) are in Adsorption_CHP1404/ and Adsorption_scram1404/. The actual calculation of the free energy is done by a shell script ("doRestraintEnergyError.sh") in the 1_free_energy/ subsubdirectory. Processing of the PMFs must be done first in the 0_pmf/ subsubdirectory. Finally, files for free energy calculations of pair formation for CHP1404 are found in the Pair/ subdirectory.

    Sim_Figure-8: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) where the peptides are far above the graphene–water interface in the initial configuration.

    Sim_Figure-9: Two replicates of a simulation of nine peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-9_scrambled: Two replicates of a simulation of nine peptide molecules with the control sequence cyc(GGTPTTGGGGGGSGGPSGTGGC) at the graphite–water interface at 370 K.

    Sim_Figure-10: Adaptive biasing for calculation of the free energy of the folded peptide as a function of the angle between its long axis and the zigzag directions of the underlying graphene sheet.

     

    This material is based upon work supported by the US National Science Foundation under grant no. DMR-1945589. A majority of the computing for this project was performed on the Beocat Research Cluster at Kansas State University, which is funded in part by NSF grants CHE-1726332, CNS-1006860, EPS-1006860, and EPS-0919443. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562, through allocation BIO200030. 
  2. The modern configuration of the South East Asian Islands (SEAI) evolved over the last fifteen million years as a result of subduction, arc magmatism, and arc-continent collisions, contributing to both increased land area and high topography. The presence of the additional land area has been postulated to enhance convective rainfall, facilitating both increased silicate weathering and the development of the modern-day Walker circulation. Using an Earth System Model in conjunction with a climate-silicate weathering model, we argue instead for a significant role of SEAI topography for both effects. This dataset archives model output used in this investigation, including simulations using the Community Earth System Model version 1.2 and the climate-silicate weathering model GEOCLIM.

    All data are in NetCDF format and were generated either by the Community Earth System Model 1.2 (Hurrell et al. 2013) or the climate-silicate weathering model GEOCLIM (Park et al. 2020). Model output is organized into 4 tar files:

    1) B1850C5.tar
    Contains model output for the fully coupled CESM1.2 runs, for 2D fields and for 3D pressure vertical velocity (W) between 10S-10N. Monthly mean data for years 41-110 of the simulations. Naming convention is
    No SEAI topography: B1850C5_noSEAItopo_y41-110.nc and B1850C5_noSEAItopo_W_y41-110.nc
    50% SEAI topography: B1850C5_0.5SEAItopo_y41-110.nc and B1850C5_0.5SEAItopo_W_y41-110.nc
    100% SEAI topography: B1850C5_y41-110.nc and B1850C5_W_y41-110.nc
    150% SEAI topography: B1850C5_1.5SEAItopo_y41-110.nc and B1850C5_1.5SEAItopo_W_y41-110.nc

    2) E1850C5.tar
    Contains model output for the slab ocean CESM1.2 runs, for 2D fields and for 3D pressure vertical velocity (W) between 10S-10N. Monthly mean data for years 21-50 of the simulations. Naming convention is
    No SEAI topography: E1850C5_noSEAItopo_y21-50.nc and E1850C5_noSEAItopo_W_y21-50.nc
    50% SEAI topography: E1850C5_0.5SEAItopo_y21-50.nc and E1850C5_0.5SEAItopo_W_y21-50.nc
    100% SEAI topography: E1850C5_y21-50.nc and E1850C5_W_y21-50.nc
    150% SEAI topography: E1850C5_1.5SEAItopo_y21-50.nc and E1850C5_1.5SEAItopo_W_y21-50.nc

    3) GEOCLIM.tar
    Contains model output from the climate-silicate weathering model GEOCLIM. Data is provided for all 573 parameter combinations. All values are climatological annual means. All files contain these variables:
    GMST: global mean surface temperature (in K)
    atm_CO2_level: atmospheric pCO2 (in ppm)
    degassing: globally-integrated CO2 flux (in mol/yr)
    The files ending with 1xCO2.nc also contain these spatial fields:
    lithology fraction: fraction of land covered by a lithology class
    erosion: Regolith erosion rate (m/yr)
    weathering: Ca-Mg weathering rate (mol/m^2/yr)
    The files are:
    E1850C5_1xCO2.nc - GEOCLIM output using the Modern SEAI simulation as input, and for CO2 fixed to 286.7ppm.
    E1850C5_noSEAI_1xCO2.nc - GEOCLIM output using the no SEAI simulation as input, and for CO2 fixed to 286.7ppm.
    E1850C5_noSEAItopo_1xCO2.nc - GEOCLIM output using the flat SEAI simulation as input, and for CO2 fixed to 286.7ppm.
    E1850C5_noSEAI_equil.nc - GEOCLIM output using the no SEAI simulation as input, and CO2 adjusted so that system is in carbon flux equilibrium.
    E1850C5_noSEAItopo_flatSEAIslope_equil.nc - GEOCLIM output using the flat SEAI simulation as input, and CO2 adjusted so that system is in carbon flux equilibrium.

    4) Surface.tar
    Contains land fraction and surface geopotential fields for the modern SEAI (Landfrac.nc) and no SEAI (Landfrac_noSEAI.nc) simulations.

    References
    Hurrell, J.W., Holland, M.M., Gent, P.R., Ghan, S., Kay, J.E., Kushner, P.J., Lamarque, J.F., Large, W.G., Lawrence, D., Lindsay, K. and Lipscomb, W.H., 2013. The community earth system model: a framework for collaborative research. Bulletin of the American Meteorological Society, 94(9), pp.1339-1360.
    Park, Y., Maffre, P., Goddéris, Y., Macdonald, F.A., Anttila, E.S. and Swanson-Hysell, N.L., 2020. Emergence of the Southeast Asian islands as a driver for Neogene cooling. Proceedings of the National Academy of Sciences, 117(41), pp.25319-25326.
  3. Data files were used in support of the research paper titled "Experimentation Framework for Wireless Communication Systems under Jamming Scenarios", which has been submitted to the IET Cyber-Physical Systems: Theory & Applications journal.

    Authors: Marko Jacovic, Michael J. Liston, Vasil Pano, Geoffrey Mainland, Kapil R. Dandekar
    Contact: krd26@drexel.edu

    ---------------------------------------------------------------------------------------------

    Top-level directories correspond to the case studies discussed in the paper. Each includes the sub-directories: logs, parsers, rayTracingEmulation, results. 

    --------------------------------

    logs:    - data logs collected from devices under test
        - 'defenseInfrastucture' contains console output from a WARP 802.11 reference design network. Filename structure follows '*x*dB_*y*.txt', in which *x* is the reactive jamming power level and *y* is the jamming duration in samples (100k samples = 1 ms). 'noJammer.txt' does not include the jammer and is a baseline case. 'outMedian.txt' contains the median statistics for log files collected prior to the inclusion of the calculation in the processing script. 
        - 'uavCommunication' contains MGEN logs at each receiver for cases using omni-directional and RALA antennas with a 10 dB constant jammer and without the jammer. Omni-directional folder contains multiple repeated experiments to provide reliable results during each calculation window. RALA directories use s*N* folders in which *N* represents each antenna state. 
        - 'vehicularTechnologies' contains MGEN logs at the car receiver for different scenarios. 'rxNj_5rep.drc' does not consider jammers present, 'rx33J_5rep.drc' introduces the periodic jammer, in 'rx33jSched_5rep.drc' the device under test uses time scheduling around the periodic jammer, in 'rx33JSchedRandom_5rep.drc' the same modified time schedule is used with a random jammer. 

    --------------------------------

    parsers:    - scripts used to collect or process the log files used in the study
            - 'defenseInfrastructure' contains the 'xputFiveNodes.py' script which is used to control and log the throughput of a 5-node WARP 802.11 reference design network. Log files are manually inspected to generate results (end of log file provides a summary). 
            - 'uavCommunication' contains a 'readMe.txt' file which describes the parsing of the MGEN logs using TRPR. TRPR must be installed to run the scripts and directory locations must be updated. 
            - 'vehicularTechnologies' contains the 'mgenParser.py' script and supporting 'bfb.json' configuration file which also require TRPR to be installed and directories to be updated. 

    --------------------------------

    rayTracingEmulation:    - 'wirelessInsiteImages': images of model used in Wireless Insite
                - 'channelSummary.pdf': summary of channel statistics from ray-tracing study
                - 'rawScenario': scenario files resulting from code base directly from ray-tracing output based on configuration defined by '*WI.json' file 
                - 'processedScenario': pre-processed scenario file to be used by DYSE channel emulator based on configuration defined by '*DYSE.json' file, applies fixed attenuation measured externally by spectrum analyzer and additional transmit power per node if desired
                - DYSE scenario file format: time stamp (milli seconds), receiver ID, transmitter ID, main path gain (dB), main path phase (radians), main path delay (micro seconds), Doppler shift (Hz), multipath 1 gain (dB), multipath 1 phase (radians), multipath 1 delay relative to main path delay (micro seconds), multipath 2 gain (dB), multipath 2 phase (radians), multipath 2 delay relative to main path delay (micro seconds)
                - 'nodeMapping.txt': mapping of Wireless Insite transceivers to DYSE channel emulator physical connections required
                - 'uavCommunication' directory additionally includes 'antennaPattern' which contains the RALA pattern data for the omni-directional mode ('omni.csv') and directional state ('90.csv')
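
The DYSE scenario file format listed above names thirteen fields per record. As a minimal Python sketch of how one record could be split into named fields, the example below assumes that each record is a single comma-separated line and that the two ID fields can be kept as text; neither assumption is stated in this description, so the `delimiter` argument should be adjusted to the actual files. `DyseRecord` and `parse_record` are hypothetical names.

```python
from typing import NamedTuple


class DyseRecord(NamedTuple):
    """One scenario record, in the field order given in the list above."""
    time_ms: float          # time stamp (milliseconds)
    receiver_id: str
    transmitter_id: str
    main_gain_db: float     # main path gain (dB)
    main_phase_rad: float   # main path phase (radians)
    main_delay_us: float    # main path delay (microseconds)
    doppler_hz: float       # Doppler shift (Hz)
    mp1_gain_db: float      # multipath 1 gain (dB)
    mp1_phase_rad: float    # multipath 1 phase (radians)
    mp1_delay_us: float     # multipath 1 delay relative to main path (microseconds)
    mp2_gain_db: float      # multipath 2 gain (dB)
    mp2_phase_rad: float    # multipath 2 phase (radians)
    mp2_delay_us: float     # multipath 2 delay relative to main path (microseconds)


def parse_record(line, delimiter=","):
    """Split one record into named fields (the delimiter is an assumption)."""
    parts = [part.strip() for part in line.split(delimiter)]
    if len(parts) != len(DyseRecord._fields):
        raise ValueError(f"expected {len(DyseRecord._fields)} fields, got {len(parts)}")
    # Field 0 and fields 3..12 are numeric; fields 1 and 2 are the IDs.
    numbers = [float(part) for part in parts[:1] + parts[3:]]
    return DyseRecord(numbers[0], parts[1], parts[2], *numbers[1:])
```

Keeping the field order in one `NamedTuple` makes it easy to compare against the format description if the files turn out to use a different layout.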

    --------------------------------

    results:    - contains performance results used in paper based on parsing of aforementioned log files
     

     
  4. The intended use of this archive is to facilitate meta-analysis of the Data Observation Network for Earth (DataONE, [1]). 

    DataONE is a distributed infrastructure that provides information about earth observation data. This dataset was derived from the DataONE network using Preston [2] between 17 October 2018 and 6 November 2018, resolving 335,213 urls at an average retrieval rate of about 5 seconds per url, or 720 files per hour, resulting in a gzip-compressed tar archive of 837.3 MB.
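
The retrieval-rate figures quoted above can be cross-checked with a short Python sketch. It assumes urls were fetched one after another at the stated average rate, which is an illustration only and not a description of how Preston schedules requests. The function names are hypothetical.

```python
from datetime import date

# Figures quoted in the paragraph above.
SECONDS_PER_URL = 5
URLS_RESOLVED = 335_213
CRAWL_START = date(2018, 10, 17)
CRAWL_END = date(2018, 11, 6)


def urls_per_hour(seconds_per_url=SECONDS_PER_URL):
    """Convert an average retrieval time per url into urls per hour."""
    return 3600 / seconds_per_url


def sequential_crawl_days(n_urls=URLS_RESOLVED, seconds_per_url=SECONDS_PER_URL):
    """Days needed if every url were resolved sequentially at the average rate."""
    return n_urls * seconds_per_url / 86_400


def crawl_window_days(start=CRAWL_START, end=CRAWL_END):
    """Length of the stated crawl window in days."""
    return (end - start).days
```

Under that assumption the crawl takes about 19.4 days, which fits within the stated 20-day window between 17 October and 6 November 2018.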

    The archive associates 325,757 unique metadata urls [3] to 202,063 unique ecological metadata files [4]. Also, the DataONE search index was captured to establish provenance of how the dataset descriptors were found and acquired. During the creation of the snapshot (or crawl), 15,389 urls [5], or 4.7% of urls, did not successfully resolve. 

    To facilitate discovery, the record of the Preston snapshot crawl is included in the preston-ls-* files. These files are derived from the rdf/nquad file with hash://sha256/8c67e0741d1c90db54740e08d2e39d91dfd73566ea69c1f2da0d9ab9780a9a9f . This file can also be found in the data.tar.gz at data/8c/67/e0/8c67e0741d1c90db54740e08d2e39d91dfd73566ea69c1f2da0d9ab9780a9a9f/data . For more information about concepts and format, please see [2]. 

    To extract all EML files from the included Preston archive, first extract the hashes associated with EML files using:

    cat preston-ls.tsv.gz | gunzip | grep "Version" | grep -v "deeplinker" | grep -v "query/solr" | cut -f1,3 | tr '\t' '\n' | grep "hash://" | sort | uniq > eml-hashes.txt
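
For readers who prefer not to use a shell pipeline, the filtering steps of the command above can be sketched in Python with the standard library only. This is an approximate equivalent, not a tool shipped with the archive: `extract_hashes` and `read_listing` are hypothetical names, and the result is sorted by code point, which may order entries differently from a locale-aware `sort`.

```python
import gzip


def extract_hashes(lines):
    """Mirror the grep/cut/tr/sort/uniq steps of the shell pipeline."""
    hashes = set()
    for line in lines:
        # grep "Version" | grep -v "deeplinker" | grep -v "query/solr"
        if "Version" not in line or "deeplinker" in line or "query/solr" in line:
            continue
        fields = line.rstrip("\n").split("\t")
        # cut -f1,3 | tr '\t' '\n' | grep "hash://"
        for value in fields[0:1] + fields[2:3]:
            if "hash://" in value:
                hashes.add(value)
    # sort | uniq
    return sorted(hashes)


def read_listing(path="preston-ls.tsv.gz"):
    """Read the gzip-compressed listing and return the unique content hashes."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return extract_hashes(handle)
```

Writing the list returned by `read_listing()` one entry per line would produce the same kind of file as eml-hashes.txt.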

    extract data.tar.gz using:

    ~/preston-archive$ tar xzf data.tar.gz 

    then use Preston to extract each hash using something like:

    ~/preston-archive$ preston get hash://sha256/00002d0fc9e35a9194da7dd3d8ce25eddee40740533f5af2397d6708542b9baa
    <eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stmml="http://www.xml-cml.org/schema/stmml_1.1" packageId="doi:10.18739/A24P9Q" system="https://arcticdata.io" scope="system" xsi:schemaLocation="eml://ecoinformatics.org/eml-2.1.1 ~/development/eml/eml.xsd">
      <dataset>
        <alternateIdentifier>urn:x-wmo:md:org.aoncadis.www::d76bc3b5-7b19-11e4-8526-00c0f03d5b7c</alternateIdentifier>
        <alternateIdentifier>d76bc3b5-7b19-11e4-8526-00c0f03d5b7c</alternateIdentifier>
        <title>Airglow Image Data 2011 4 of 5</title>
    ...

    Alternatively, without using Preston, you can extract the data using the naming convention:

    data/[x]/[y]/[z]/[hash]/data

    where x is the first 2 characters of the hash, y the second 2 characters, z the third 2 characters, and hash the full sha256 content hash of the EML file.

    For example, the hash hash://sha256/00002d0fc9e35a9194da7dd3d8ce25eddee40740533f5af2397d6708542b9baa can be found in the file: data/00/00/2d/00002d0fc9e35a9194da7dd3d8ce25eddee40740533f5af2397d6708542b9baa/data . For more information, see [2].
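
The naming convention above can be expressed as a small Python helper. This is a minimal sketch for locating files after data.tar.gz has been extracted; `hash_to_path` is a hypothetical name and the `root` default assumes the archive was unpacked in the current directory.

```python
import string

HASH_PREFIX = "hash://sha256/"


def hash_to_path(content_hash, root="data"):
    """Map a sha256 content hash to data/[x]/[y]/[z]/[hash]/data."""
    if not content_hash.startswith(HASH_PREFIX):
        raise ValueError(f"not a sha256 content hash: {content_hash}")
    digest = content_hash[len(HASH_PREFIX):]
    if len(digest) != 64 or any(c not in string.hexdigits for c in digest):
        raise ValueError(f"expected 64 hexadecimal characters, got: {digest}")
    # x, y, z are the first, second, and third pairs of characters of the hash.
    return "/".join([root, digest[0:2], digest[2:4], digest[4:6], digest, "data"])
```

Applied to the example hash above, the helper returns the same path that is quoted in the text.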

    The intended use of this archive is to facilitate meta-analysis of the DataONE dataset network. 

    [1] DataONE, https://www.dataone.org
    [2] https://preston.guoda.bio, https://doi.org/10.5281/zenodo.1410543 . DataONE was crawled via Preston with "preston update -u https://dataone.org".
    [3] cat preston-ls.tsv.gz | gunzip | grep "Version" | grep -v "deeplinker" | grep -v "query/solr" | cut -f1,3 | tr '\t' '\n' | grep -v "hash://" | sort | uniq | wc -l
    [4] cat preston-ls.tsv.gz | gunzip | grep "Version" | grep -v "deeplinker" | grep -v "query/solr" | cut -f1,3 | tr '\t' '\n' | grep "hash://" | sort | uniq | wc -l
    [5] cat preston-ls.tsv.gz | gunzip | grep "Version" | grep  "deeplinker" | grep -v "query/solr" | cut -f1,3 | tr '\t' '\n' | grep -v "hash://" | sort | uniq | wc -l

    This work is funded in part by grant NSF OAC 1839201 from the National Science Foundation.

     
  5. A biodiversity dataset graph: DataONE

    The intended use of this archive is to facilitate meta-analysis of the Data Observation Network for Earth (DataONE, [1]). DataONE is a distributed infrastructure that provides information about earth observation data. 

    This dataset provides versioned snapshots of the DataONE network as tracked by Preston [2] between 17 October 2018 and 7 July 2019.  

    The archive consists of 256 individual parts (e.g., preston-00.tar.gz, preston-01.tar.gz, ...) to allow for parallel file downloads. The archive contains three types of files: index files, provenance files and data files. Only two index and provenance files are included and have been individually included in this dataset publication. Index files provide a way to link provenance files in time to establish a versioning mechanism. Provenance files describe how, when and where the DataONE meta-data files were retrieved. For more information, please visit https://preston.guoda.bio or https://doi.org/10.5281/zenodo.1410543.

    To retrieve and verify the downloaded DataONE biodiversity dataset graph, first concatenate all the downloaded preston-*.tar.gz files (e.g., cat preston-*.tar.gz > preston.tar.gz). Then, extract the archives into a "data" folder. Alternatively, you can use the preston[2] command-line tool to "clone" this dataset using:

    $ java -jar preston.jar clone --remote https://zenodo.org/record/3277312/files

    After that, verify the index of the archive by reproducing the following result:

    $ java -jar preston.jar history
    <0659a54f-b713-4f86-a917-5be166a14110> <http://purl.org/pav/hasVersion> <hash://sha256/8c67e0741d1c90db54740e08d2e39d91dfd73566ea69c1f2da0d9ab9780a9a9f> .
    <hash://sha256/3ed3acaca7ac57f546d0b8877c1927ab5e08c23eccaa8219600c59c77a72c685> <http://purl.org/pav/previousVersion> <hash://sha256/8c67e0741d1c90db54740e08d2e39d91dfd73566ea69c1f2da0d9ab9780a9a9f> .
    <hash://sha256/857753997a7595a1b372b05641b58a25d9408b7ff08d557ce1fe8b73e4bd383f> <http://purl.org/pav/previousVersion> <hash://sha256/3ed3acaca7ac57f546d0b8877c1927ab5e08c23eccaa8219600c59c77a72c685> .
    <hash://sha256/7ee0376f4c3f7aeeda36927a5211395e5da8201e810e8c7e638a0fe23d001e88> <http://purl.org/pav/previousVersion> <hash://sha256/857753997a7595a1b372b05641b58a25d9408b7ff08d557ce1fe8b73e4bd383f> .
    <hash://sha256/68b4974d8ab7c4c7a7a4305065839b60ba460aaa862590b34c67877738feba90> <http://purl.org/pav/previousVersion> <hash://sha256/7ee0376f4c3f7aeeda36927a5211395e5da8201e810e8c7e638a0fe23d001e88> .
    <hash://sha256/060a76d56255bf9482c951748c91291fddeeb20f180632132be1344e081b2372> <http://purl.org/pav/previousVersion> <hash://sha256/68b4974d8ab7c4c7a7a4305065839b60ba460aaa862590b34c67877738feba90> .
    <hash://sha256/29357bdfab4548025f8a5743301f5c3c9146fa436c39e3c9e019fb9409ac9c42> <http://purl.org/pav/previousVersion> <hash://sha256/060a76d56255bf9482c951748c91291fddeeb20f180632132be1344e081b2372> .
    <hash://sha256/3669cd95100d1d533eb8953ff4ec5092cbd8addb8879b3e6262191148a8a3ebb> <http://purl.org/pav/previousVersion> <hash://sha256/29357bdfab4548025f8a5743301f5c3c9146fa436c39e3c9e019fb9409ac9c42> .
    <hash://sha256/8dc1663299359d271cb1b4c14ad521d0f1be67743689dd18016543dc1e097efb> <http://purl.org/pav/previousVersion> <hash://sha256/3669cd95100d1d533eb8953ff4ec5092cbd8addb8879b3e6262191148a8a3ebb> .
    <hash://sha256/dc4903e8afee651db1d9bf509f20503bf9c8e89679c4bcffb46d5b97440cb6de> <http://purl.org/pav/previousVersion> <hash://sha256/8dc1663299359d271cb1b4c14ad521d0f1be67743689dd18016543dc1e097efb> .

    To check the integrity of the extracted archive, confirm that each line produced by the command "preston verify" resembles the lines shown below, with each line including "CONTENT_PRESENT_VALID_HASH". Depending on hardware capacity, this may take a while.

    $ java -jar preston.jar verify
    hash://sha256/e55c1034d985740926564e94decd6dc7a70f779a33e7deb931553739cda16945    file:/home/preston/preston-dataone/data/e5/5c/e55c1034d985740926564e94decd6dc7a70f779a33e7deb931553739cda16945    OK    CONTENT_PRESENT_VALID_HASH    21580
    hash://sha256/d0ddcc2111b6134a570bcc7d89375920ef4d754130cecc0727c79d2b05a9f81f    file:/home/preston/preston-dataone/data/d0/dd/d0ddcc2111b6134a570bcc7d89375920ef4d754130cecc0727c79d2b05a9f81f    OK    CONTENT_PRESENT_VALID_HASH    2035
    hash://sha256/472de9d1c9fd7e044aac409abfbfff9f12c6b69359df995d431009580ffb0f53    file:/home/preston/preston-dataone/data/47/2d/472de9d1c9fd7e044aac409abfbfff9f12c6b69359df995d431009580ffb0f53    OK    CONTENT_PRESENT_VALID_HASH    1935
    hash://sha256/b29879462cd43862129c5cf9b149c41ecd33ffef284a4dbea4ac1c0f90108687    file:/home/preston/preston-dataone/data/b2/98/b29879462cd43862129c5cf9b149c41ecd33ffef284a4dbea4ac1c0f90108687    OK    CONTENT_PRESENT_VALID_HASH    1553
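
Scanning a long verification report by eye is error-prone, so the check described above can be sketched in Python. The example assumes the output of "preston verify" was saved to a text file and that fields are separated by whitespace, as in the sample lines above. `find_unverified` is a hypothetical helper, not part of Preston.

```python
EXPECTED_STATUS = "CONTENT_PRESENT_VALID_HASH"


def find_unverified(lines):
    """Return every non-empty line that lacks the expected status field."""
    problems = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if EXPECTED_STATUS not in fields:
            problems.append(line.rstrip("\n"))
    return problems


def check_report(path):
    """Read a saved verification report and list the lines that need attention."""
    with open(path, encoding="utf-8") as handle:
        return find_unverified(handle)
```

An empty list from `check_report` means every reported line carried the expected status; any returned lines point to content worth re-downloading or inspecting.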

    Note that a copy of the Java program "preston", preston.jar, is included in this publication. The program runs on a Java 8+ virtual machine using "java -jar preston.jar", or in short "preston". 

    Files in this data publication:

    README - this file

    preston.jar - executable java jar containing preston[2] v0.1.1.

    preston-[00-ff].tar.gz - preston archives containing DataONE meta-data files, their provenance and a provenance index.

    2a5de79372318317a382ea9a2cef069780b852b01210ef59e06b640a3539cb5a - preston index file
    2aecaf289def0e23a27058bf7715f226ef9189905f0be13228174825633125cf - preston index file
    3d38b70198e448674be6a63d14b9817f3a956f48bba7418fa7baa086a56c05b7 - preston index file
    66ad3e5e904740f1e835ac6718dda4279e0c24b204ea0d1113cda1352a5072ba - preston index file
    8bf062872ce958545d361e9d53a552ffb025ac29ab875caad1157c0995d34f66 - preston index file
    d9378616636be3686bbabd5bf29d50f0ef0e5ceb5ddd7dfce47f7e755b596b7d - preston index file
    da26fa6e7371385ed3f61af9a766221c833060d59dfd4869bbd7110f95f288db - preston index file
    e4103a75627857de3ee2e317429108611c244fc448c01d1d7bf652115c3b8a55 - preston index file
    eb368fedb8f100210dd968edcf80f4d13cab3dd64135a6ab744102cf15e68c94 - preston index file
    ff92b6c06ae5286bd2f1db679e0fcc4da294acb9bc01b2e9522378d99218c2e3 - preston index file

    [1] DataONE, https://www.dataone.org
    [2] https://preston.guoda.bio, https://doi.org/10.5281/zenodo.1410543 . DataONE was crawled via Preston with "preston update -u https://dataone.org".

    This work is funded in part by grant NSF OAC 1839201 from the National Science Foundation.

     