{"Abstract":["The intended use of this archive is to facilitate meta-analysis of the Data Observation Network for Earth (DataONE, [1]). <\/p>\n\nDataONE is a distributed infrastructure that provides information about earth observation data. This dataset was derived from the DataONE network using Preston [2] between 17 October 2018 and 6 November 2018, resolving 335,213 urls at an average retrieval rate of about 5 seconds per url, or 720 files per hour, resulting in a data gzip compressed tar archive of 837.3 MB . <\/p>\n\nThe archive associates 325,757 unique metadata urls [3] to 202,063 unique ecological metadata files [4]. Also, the DataONE search index was captured to establish provenance of how the dataset descriptors were found and acquired. During the creation of the snapshot (or crawl), 15,389 urls [5], or 4.7% of urls, did not successfully resolve. <\/p>\n\nTo facilitate discovery, the record of the Preston snapshot crawl is included in the preston-ls-* files . There files are derived from the rdf/nquad file with hash://sha256/8c67e0741d1c90db54740e08d2e39d91dfd73566ea69c1f2da0d9ab9780a9a9f . This file can also be found in the data.tar.gz at data/8c/67/e0/8c67e0741d1c90db54740e08d2e39d91dfd73566ea69c1f2da0d9ab9780a9a9f/data . For more information about concepts and format, please see [2]. <\/p>\n\nTo extract all EML files from the included Preston archive, first extract the hashes assocated with EML files using:<\/p>\n\ncat preston-ls.tsv.gz | gunzip | grep "Version" | grep -v "deeplinker" | grep -v "query/solr" | cut -f1,3 | tr '\\t' '\\n' | grep "hash://" | sort | uniq > eml-hashes.txt<\/p>\n\nextract data.tar.gz using:<\/p>\n\n~/preston-archive$$ tar xzf data.tar.gz <\/p>\n\nthen use Preston to extract each hash using something like:<\/p>\n\n~/preston-archive$$ preston get hash://sha256/00002d0fc9e35a9194da7dd3d8ce25eddee40740533f5af2397d6708542b9baa\n<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stmml="http://www.xml-cml.org/schema/stmml_1.1" packageId="doi:10.18739/A24P9Q" system="https://arcticdata.io" scope="system" xsi:schemaLocation="eml://ecoinformatics.org/eml-2.1.1 ~/development/eml/eml.xsd">\n <dataset>\n <alternateIdentifier>urn:x-wmo:md:org.aoncadis.www::d76bc3b5-7b19-11e4-8526-00c0f03d5b7c</alternateIdentifier>\n <alternateIdentifier>d76bc3b5-7b19-11e4-8526-00c0f03d5b7c</alternateIdentifier>\n <title>Airglow Image Data 2011 4 of 5</title>\n...<\/p>\n\nAlternatively, without using Preston, you can extract the data using the naming convention:<\/p>\n\ndata/[x]/[y]/[z]/[hash]/data<\/p>\n\nwhere x is the first 2 characters of the hash, y the second 2 characters, z the third 2 characters, and hash the full sha256 content hash of the EML file.<\/p>\n\nFor example, the hash hash://sha256/00002d0fc9e35a9194da7dd3d8ce25eddee40740533f5af2397d6708542b9baa can be found in the file: data/00/00/2d/00002d0fc9e35a9194da7dd3d8ce25eddee40740533f5af2397d6708542b9baa/data . For more information, see [2].<\/p>\n\nThe intended use of this archive is to facilitate meta-analysis of the DataONE dataset network. <\/p>\n\n[1] DataONE, https://www.dataone.org\n[2] https://preston.guoda.bio, https://doi.org/10.5281/zenodo.1410543 . DataONE was crawled via Preston with "preston update -u https://dataone.org".\n[3] cat preston-ls.tsv.gz | gunzip | grep "Version" | grep -v "deeplinker" | grep -v "query/solr" | cut -f1,3 | tr '\\t' '\\n' | grep -v "hash://" | sort | uniq | wc -l\n[4] cat preston-ls.tsv.gz | gunzip | grep "Version" | grep -v "deeplinker" | grep -v "query/solr" | cut -f1,3 | tr '\\t' '\\n' | grep "hash://" | sort | uniq | wc -l\n[5] cat preston-ls.tsv.gz | gunzip | grep "Version" | grep "deeplinker" | grep -v "query/solr" | cut -f1,3 | tr '\\t' '\\n' | grep -v "hash://" | sort | uniq | wc -l<\/p>\n\nThis work is funded in part by grant NSF OAC 1839201 from the National Science Foundation.<\/p>"]}
more »
« less
Drone image-derived digital elevation models ('flight of terraces' site)
{"Abstract":["Drone image-derived digital elevation model at 'flight of terraces' site in "Microcontinent Breakup and Links to Possible Plate Boundary Reorganization in the Northern Gulf of California, México" created with Agisoft Photoscan software. WGS1984 UTM Zone 12N."]}
more »
« less
- Award ID(s):
- 1728145
- PAR ID:
- 10356808
- Publisher / Repository:
- UCLA Dataverse
- Date Published:
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Revision: This revision includes four independent trajectory values of the ensemble averages of the mean squared radii of gyration and their standard deviations, which can be used to compute statistical measures such as the standard error. This distribution provides access to 18,450 configurations of coarse-grained polymers. The data is provided as a serialized object using the `pickle' Python module and in csv format. The data was compiled using Python version 3.8. ReferencesThe specific applications and analyses of the data are described in 1. Jiang, S.; Webb, M.A. "Physics-Guided Neural Networks for Transferable Prediction of Polymer Properties" DataThere are seven .pickle files that contain serialized Python objects. pattern_graph_data_*_*_rg_new.pickle: squared radii of gyration distribution from MD simulation. The number indicates the molecular weight range. rg2_baseline_*_new.pickle: squared radii of gyration distribution from Gaussian chain theoretical prediction. delta_data_v0314.pickle: torch_geometric training data. UsageTo access the data in the .pickle file, users can execute the following: # LOAD SIMULATION DATADATA_DIR = "your/custom/dir/"mw = 40 # or 90, 190 MWs filename = os.path.join(DATA_DIR, f"pattern_graph_data_{mw}_{mw+20}_rg_new.pickle")with open(filename, "rb") as handle: graph = pickle.load(handle) label = pickle.load(handle) desc = pickle.load(handle) meta = pickle.load(handle) mode = pickle.load(handle) rg2_mean = pickle.load(handle) rg2_std = pickle.load(handle) ** 0.5 # var # combine asymmetric and symmetric star polymerslabel[label == 'stara'] = 'star'# combine bottlebrush and other comb polymerslabel[label == 'bottlebrush'] = 'comb' # LOAD GAUSSIAN CHAIN THEORETICAL DATAwith open(os.path.join(DATA_DIR, f"rg2_baseline_{mw}_new.pickle"), "rb") as handle: rg2_mean_theo = pickle.load(handle)[:, 0] rg2_std_theo = pickle.load(handle)[:, 0] graph: NetworkX graph representations of polymers. label: Architectural classes of polymers (e.g., linear, cyclic, star, branch, comb, dendrimer). desc: Topological descriptors (optional). meta: Identifiers for unique architectures (optional). mode: Identifiers for unique chemical patterns (optional). rg2_mean: Mean squared radii of gyration from simulations. rg2_std: Corresponding standard deviation from simulations. rg2_mean_theo: Mean squared radii of gyration from theoretical models. rg2_std_theo: Corresponding standard deviation from theoretical models. Help, Suggestions, Corrections?If you need help, have suggestions, identify issues, or have corrections, please send your comments to Shengli Jiang at sj0161@princeton.edu GitHubAdditional data and code relevant for this study is additionally accessible at https://github.com/webbtheosim/gcgnnmore » « less
-
Dataset Description This dataset contains 6710 structural configurations and solvophobicity values for topologically and chemically diverse coarse-grained polymer chains. Additionally, 480 polymers include shear-rate dependent viscosity profiles at 2 wt% polymer concentration.The data is provided as serialized objects using the pickle Python module.All files were generated using Python version 3.10. Data There are three pickle files containing serialized Python objects. Key files include: data_aug10.pickle Contains the coarse-grained polymer dataset with 6710 entries. Each entry includes: Polymer graph Squared radius of gyration (at lambda = 0). Solvophobicity (lambda). Bead count (N). Chain virial number (Xi). topo_param_visc.pickle Shear-rate-dependent viscosity profiles of 480 polymer systems. target_curves.pickle Contains 30 target viscosity profiles used for active learning. Usage To load the dataset stored in data_aug10.pickle, use the following code: import pickle with open("data_aug10.pickle", "rb") as handle: ( (x_train, y_train, c_train, l_train, graph_train), (x_valid, y_valid, c_valid, l_valid, graph_valid), (x_test, y_test, c_test, l_test, graph_test), NAMES, SCALER, SCALER_y, le ) = pickle.load(handle) x: node features for each polymer graph y: labels (e.g., predicted properties) c: topological class indices l: topological descriptors graph: NetworkX graphs representing polymer topology NAMES: list of topological class names SCALER: fitted scaler for topological descriptors (l) SCALER_y: fitted scaler for property labels (y) le: label encoder for topological class indices To load the dataset stored in topo_param_visc.pickle, use the following code: import pickle with open("poly_data_ml.pickle", "rb") as handle: desc_all, ps_all, curve_all, shear_rate, graph_all = pickle.load(handle) desc_all: topological descriptors for each polymer graph ps_all: fitted Carreau–Yasuda model parameters curve_all: fitted viscosity curves shear_rate: shear rates corresponding to each viscosity curve graph_all: polymer graphs represented as NetworkX objects First 30: seed dataset Next 150: 5 iterations (30 each) from class-balanced space-filling Following 150: space-filling without class balancing Final 150: active learning samples To load the dataset stored in target_curves.pickle, use the following code: import pickle with open("target_curves.pickle", "rb") as handle: data = pickle.load(f) curves = data['curves']params = data['params']shear_rate = data["xx"] curves: target viscosity curves used as design objectives params: Carreau–Yasuda model parameters fitted to the target curves shear_rate: shear rate values associated with the target curves Help, Suggestions, Corrections?If you need help, have suggestions, identify issues, or have corrections, please send your comments to Shengli Jiang at sj0161@princeton.edu GitHubAdditional data and code relevant for this study is additionally accessible at https://github.com/webbtheosim/cg-topo-solvmore » « less
An official website of the United States government
