Dataset Description

This dataset contains 6,710 structural configurations and solvophobicity values for topologically and chemically diverse coarse-grained polymer chains. Additionally, 480 polymers include shear-rate-dependent viscosity profiles at 2 wt% polymer concentration. The data is provided as serialized objects using the pickle Python module. All files were generated using Python version 3.10.

Data

There are three pickle files containing serialized Python objects:

- data_aug10.pickle: the coarse-grained polymer dataset with 6,710 entries. Each entry includes:
  - Polymer graph
  - Squared radius of gyration (at lambda = 0)
  - Solvophobicity (lambda)
  - Bead count (N)
  - Chain virial number (Xi)
- topo_param_visc.pickle: shear-rate-dependent viscosity profiles of 480 polymer systems.
- target_curves.pickle: 30 target viscosity profiles used for active learning.

Usage

To load the dataset stored in data_aug10.pickle, use the following code:

    import pickle

    with open("data_aug10.pickle", "rb") as handle:
        (
            (x_train, y_train, c_train, l_train, graph_train),
            (x_valid, y_valid, c_valid, l_valid, graph_valid),
            (x_test, y_test, c_test, l_test, graph_test),
            NAMES, SCALER, SCALER_y, le
        ) = pickle.load(handle)

- x: node features for each polymer graph
- y: property labels (targets for prediction)
- c: topological class indices
- l: topological descriptors
- graph: NetworkX graphs representing polymer topology
- NAMES: list of topological class names
- SCALER: fitted scaler for topological descriptors (l)
- SCALER_y: fitted scaler for property labels (y)
- le: label encoder for topological class indices

(A worked example that inspects a single loaded entry appears at the end of this record.)

To load the dataset stored in topo_param_visc.pickle, use the following code:

    import pickle

    with open("topo_param_visc.pickle", "rb") as handle:
        desc_all, ps_all, curve_all, shear_rate, graph_all = pickle.load(handle)

- desc_all: topological descriptors for each polymer graph
- ps_all: fitted Carreau–Yasuda model parameters
- curve_all: fitted viscosity curves
- shear_rate: shear rates corresponding to each viscosity curve
- graph_all: polymer graphs represented as NetworkX objects

The 480 entries are ordered as follows:
- First 30: seed dataset
- Next 150: 5 iterations (30 each) of class-balanced space-filling sampling
- Following 150: space-filling sampling without class balancing
- Final 150: active-learning samples

To load the dataset stored in target_curves.pickle, use the following code:

    import pickle

    with open("target_curves.pickle", "rb") as handle:
        data = pickle.load(handle)

    curves = data["curves"]
    params = data["params"]
    shear_rate = data["xx"]

- curves: target viscosity curves used as design objectives
- params: Carreau–Yasuda model parameters fitted to the target curves
- shear_rate: shear rate values associated with the target curves

Help, Suggestions, Corrections?

If you need help, have suggestions, identify issues, or have corrections, please send your comments to Shengli Jiang at sj0161@princeton.edu

GitHub

Additional data and code relevant to this study are available at https://github.com/webbtheosim/cg-topo-solv
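As referenced above, the following is a minimal sketch of inspecting one training entry after loading data_aug10.pickle. It assumes SCALER_y and le are fitted scikit-learn objects (a scaler with inverse_transform and a LabelEncoder), which the field descriptions suggest but do not state explicitly; treat it as illustration rather than the dataset's official API.

    import pickle
    import numpy as np

    with open("data_aug10.pickle", "rb") as handle:
        (
            (x_train, y_train, c_train, l_train, graph_train),
            (x_valid, y_valid, c_valid, l_valid, graph_valid),
            (x_test, y_test, c_test, l_test, graph_test),
            NAMES, SCALER, SCALER_y, le
        ) = pickle.load(handle)

    i = 0  # first training entry
    g = graph_train[i]  # NetworkX graph of the polymer topology
    print("beads:", g.number_of_nodes(), "bonds:", g.number_of_edges())

    # Assumption: le is a fitted sklearn LabelEncoder over integer class indices.
    print("class:", le.inverse_transform([int(c_train[i])])[0])

    # Assumption: SCALER_y is a fitted sklearn scaler; invert it to recover
    # property labels in their original units.
    y_orig = SCALER_y.inverse_transform(np.atleast_2d(y_train[i]))
    print("unscaled labels:", y_orig)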
ToPoRg-18k: dataset of single-chain radii of gyration distribution for 18,450 architecturally diverse and chemically patterned coarse-grained polymers
Revision: This revision includes four independent trajectory values of the ensemble averages of the mean squared radii of gyration and their standard deviations, which can be used to compute statistical measures such as the standard error.

This dataset provides access to 18,450 configurations of coarse-grained polymers. The data is provided as serialized objects using the pickle Python module and in CSV format. The data was compiled using Python version 3.8.

References

The specific applications and analyses of the data are described in:
1. Jiang, S.; Webb, M.A. "Physics-Guided Neural Networks for Transferable Prediction of Polymer Properties"

Data

There are seven .pickle files that contain serialized Python objects:
- pattern_graph_data_*_*_rg_new.pickle: squared radii of gyration distributions from MD simulation. The numbers in the filename indicate the molecular weight range.
- rg2_baseline_*_new.pickle: squared radii of gyration distributions from Gaussian-chain theoretical predictions.
- delta_data_v0314.pickle: torch_geometric training data.

Usage

To access the data in the .pickle files, users can execute the following (a short consistency-check example follows at the end of this record):

    import os
    import pickle

    # LOAD SIMULATION DATA
    DATA_DIR = "your/custom/dir/"
    mw = 40  # or 90, 190 MWs

    filename = os.path.join(DATA_DIR, f"pattern_graph_data_{mw}_{mw+20}_rg_new.pickle")
    with open(filename, "rb") as handle:
        graph = pickle.load(handle)
        label = pickle.load(handle)
        desc = pickle.load(handle)
        meta = pickle.load(handle)
        mode = pickle.load(handle)
        rg2_mean = pickle.load(handle)
        rg2_std = pickle.load(handle) ** 0.5  # stored as variance; square root gives std

    # combine asymmetric and symmetric star polymers
    label[label == 'stara'] = 'star'
    # combine bottlebrush and other comb polymers
    label[label == 'bottlebrush'] = 'comb'

    # LOAD GAUSSIAN CHAIN THEORETICAL DATA
    with open(os.path.join(DATA_DIR, f"rg2_baseline_{mw}_new.pickle"), "rb") as handle:
        rg2_mean_theo = pickle.load(handle)[:, 0]
        rg2_std_theo = pickle.load(handle)[:, 0]

- graph: NetworkX graph representations of polymers.
- label: Architectural classes of polymers (e.g., linear, cyclic, star, branch, comb, dendrimer).
- desc: Topological descriptors (optional).
- meta: Identifiers for unique architectures (optional).
- mode: Identifiers for unique chemical patterns (optional).
- rg2_mean: Mean squared radii of gyration from simulations.
- rg2_std: Corresponding standard deviations from simulations.
- rg2_mean_theo: Mean squared radii of gyration from theoretical models.
- rg2_std_theo: Corresponding standard deviations from theoretical models.

Help, Suggestions, Corrections?

If you need help, have suggestions, identify issues, or have corrections, please send your comments to Shengli Jiang at sj0161@princeton.edu

GitHub

Additional data and code relevant to this study are available at https://github.com/webbtheosim/gcgnn
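Continuing from the Usage snippet above, the sketch below compares simulated mean squared radii of gyration against the Gaussian-chain baseline, grouped by architectural class. It assumes rg2_mean, rg2_mean_theo, and label are NumPy arrays aligned over the same polymers, which the loading code suggests but the release does not state explicitly.

    import numpy as np

    # Continues from the Usage snippet above (rg2_mean, rg2_mean_theo, label).
    rg2_sim = np.asarray(rg2_mean, dtype=float)
    rg2_theo = np.asarray(rg2_mean_theo, dtype=float)

    # Ratio of simulated to Gaussian-chain Rg^2; values far from 1 flag
    # architectures where the ideal-chain baseline is poor.
    ratio = rg2_sim / rg2_theo
    for cls in np.unique(label):
        mask = label == cls
        print(f"{cls}: mean Rg2_sim / Rg2_theo = {ratio[mask].mean():.3f}")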
- Award ID(s):
- 2320649
- PAR ID:
- 10636749
- Publisher / Repository:
- Zenodo
- Date Published:
- Format(s):
- Medium: X
- Right(s):
- Creative Commons Attribution 4.0 International
- Sponsoring Org:
- National Science Foundation
More Like this
-
This dataset holds 1,036 ternary phase diagrams and records how points on each diagram phase separate, if they do. The data is provided as a serialized object using the pickle Python module. The data was compiled using Python version 3.8.

References

The specific applications and analyses of the data are described in:
1. Dhamankar, S.; Jiang, S.; Webb, M.A. "Accelerating Multicomponent Phase-Coexistence Calculations with Physics-informed Neural Networks"

Usage

To access the data in the .pickle file, users can execute the following:

    import os
    import pickle

    # LOAD SIMULATION DATA
    DATA_DIR = "your/custom/dir/"

    filename = os.path.join(DATA_DIR, "data_clean.pickle")
    with open(filename, "rb") as handle:
        (x, y_c, y_r, phase_idx, num_phase, max_phase) = pickle.load(handle)

- x: Input x = (χ_AB, χ_BC, χ_AC, v_A, v_B, v_C, φ_A, φ_B) ∈ ℝ^8.
- y_c: Output one-hot encoded classification vector y_c ∈ ℝ^3.
- y_r: Output equilibrium composition and abundance vector y_r = (φ_A^α, φ_B^α, φ_A^β, φ_B^β, φ_A^γ, φ_B^γ, w^α, w^β, w^γ) ∈ ℝ^9.
- phase_idx: A single integer indicating which unique phase system the point belongs to.
- num_phase: A single integer indicating the number of equilibrium phases the input splits into.
- max_phase: A single integer indicating the maximum number of equilibrium phases the system splits into.

(A mass-balance sanity check on these arrays appears at the end of this record.)

Help, Suggestions, Corrections?

If you need help, have suggestions, identify issues, or have corrections, please send your comments to Shengli Jiang at sj0161@princeton.edu

GitHub

Additional data and code relevant to this study is available at https://github.com/webbtheosim/ml-ternary-phase
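As referenced above, a minimal sanity check on the loaded arrays: by the lever rule, the abundance-weighted phase compositions in y_r should recover the feed composition in x. This sketch assumes x and y_r are NumPy arrays with exactly the component layouts documented above.

    import numpy as np

    # Continuing from the loading snippet above.
    x = np.asarray(x, dtype=float)
    y_r = np.asarray(y_r, dtype=float)

    phi_A, phi_B = x[:, 6], x[:, 7]      # overall (feed) compositions
    phi_A_phases = y_r[:, [0, 2, 4]]     # φ_A in phases α, β, γ
    phi_B_phases = y_r[:, [1, 3, 5]]     # φ_B in phases α, β, γ
    w = y_r[:, 6:9]                      # phase abundances w^α, w^β, w^γ

    # Lever-rule mass balance: Σ_k w^k φ_i^k should equal φ_i for each sample.
    err_A = np.abs((w * phi_A_phases).sum(axis=1) - phi_A)
    err_B = np.abs((w * phi_B_phases).sum(axis=1) - phi_B)
    print("max mass-balance residual:", max(err_A.max(), err_B.max()))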
-
Summary

Title: Data Release for A search for extremely-high-energy neutrinos and first constraints on the ultra-high-energy cosmic-ray proton fraction with IceCube

The IceCube observatory analyzed 12.6 years of data in search of extremely-high-energy (EHE) neutrinos above 5 PeV. The resultant limit of the search (Fig 1), and the effective area of the event selection (Fig 7), are provided in this data release.

Contents

- README file: this file
- differential_limit_and_sensitivity.csv: a comma-separated-value file giving the observed experimental differential limit, and sensitivity, of the search as a function of neutrino energy. This is the content of Fig 1 in the paper. The first column is the neutrino energy in GeV. The second column is the limit in units of GeV/cm2/s/sr. The third column is the sensitivity in units of GeV/cm2/s/sr.
- effective_area.csv: a comma-separated-value file giving the effective area of the search as a function of energy. This is the content of Fig 7 in the paper. The first column is the neutrino energy in GeV. The second column is the total effective area of the search, summed across neutrino flavors and averaged across neutrinos and antineutrinos, in meters squared. The third column is the effective area of the search for the average of electron neutrinos and electron antineutrinos in units of meters squared. The fourth column is the same as the third, but for muon-flavor neutrinos. The fifth column is the same as the third and fourth, but for tau-flavor neutrinos.
- demo.py: a short Python script to demonstrate how to read the files. Run like python demo.py. A standard base Python installation is sufficient, as the only dependencies are numpy and matplotlib.

Contacts

For any questions about this data release, please write to analysis@icecube.wisc.edu
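The release's demo.py is not reproduced here; the sketch below shows one plausible way to read and plot differential_limit_and_sensitivity.csv given the column layout described above. The presence of a header row and the comma delimiter are assumptions; adjust skiprows and delimiter to match the actual file.

    import numpy as np
    import matplotlib.pyplot as plt

    # Columns per the description above: neutrino energy [GeV],
    # limit [GeV/cm^2/s/sr], sensitivity [GeV/cm^2/s/sr].
    # Assumes a one-line header; set skiprows=0 if there is none.
    energy, limit, sensitivity = np.loadtxt(
        "differential_limit_and_sensitivity.csv",
        delimiter=",", skiprows=1, unpack=True,
    )

    plt.loglog(energy, limit, label="Observed differential limit")
    plt.loglog(energy, sensitivity, "--", label="Sensitivity")
    plt.xlabel("Neutrino energy [GeV]")
    plt.ylabel("Limit [GeV cm^-2 s^-1 sr^-1]")
    plt.legend()
    plt.show()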
-
{"Abstract":["The intended use of this archive is to facilitate meta-analysis of the Data Observation Network for Earth (DataONE, [1]). <\/p>\n\nDataONE is a distributed infrastructure that provides information about earth observation data. This dataset was derived from the DataONE network using Preston [2] between 17 October 2018 and 6 November 2018, resolving 335,213 urls at an average retrieval rate of about 5 seconds per url, or 720 files per hour, resulting in a data gzip compressed tar archive of 837.3 MB . <\/p>\n\nThe archive associates 325,757 unique metadata urls [3] to 202,063 unique ecological metadata files [4]. Also, the DataONE search index was captured to establish provenance of how the dataset descriptors were found and acquired. During the creation of the snapshot (or crawl), 15,389 urls [5], or 4.7% of urls, did not successfully resolve. <\/p>\n\nTo facilitate discovery, the record of the Preston snapshot crawl is included in the preston-ls-* files . There files are derived from the rdf/nquad file with hash://sha256/8c67e0741d1c90db54740e08d2e39d91dfd73566ea69c1f2da0d9ab9780a9a9f . This file can also be found in the data.tar.gz at data/8c/67/e0/8c67e0741d1c90db54740e08d2e39d91dfd73566ea69c1f2da0d9ab9780a9a9f/data . For more information about concepts and format, please see [2]. <\/p>\n\nTo extract all EML files from the included Preston archive, first extract the hashes assocated with EML files using:<\/p>\n\ncat preston-ls.tsv.gz | gunzip | grep "Version" | grep -v "deeplinker" | grep -v "query/solr" | cut -f1,3 | tr '\\t' '\\n' | grep "hash://" | sort | uniq > eml-hashes.txt<\/p>\n\nextract data.tar.gz using:<\/p>\n\n~/preston-archive$$ tar xzf data.tar.gz <\/p>\n\nthen use Preston to extract each hash using something like:<\/p>\n\n~/preston-archive$$ preston get hash://sha256/00002d0fc9e35a9194da7dd3d8ce25eddee40740533f5af2397d6708542b9baa\n<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stmml="http://www.xml-cml.org/schema/stmml_1.1" packageId="doi:10.18739/A24P9Q" system="https://arcticdata.io" scope="system" xsi:schemaLocation="eml://ecoinformatics.org/eml-2.1.1 ~/development/eml/eml.xsd">\n <dataset>\n <alternateIdentifier>urn:x-wmo:md:org.aoncadis.www::d76bc3b5-7b19-11e4-8526-00c0f03d5b7c</alternateIdentifier>\n <alternateIdentifier>d76bc3b5-7b19-11e4-8526-00c0f03d5b7c</alternateIdentifier>\n <title>Airglow Image Data 2011 4 of 5</title>\n...<\/p>\n\nAlternatively, without using Preston, you can extract the data using the naming convention:<\/p>\n\ndata/[x]/[y]/[z]/[hash]/data<\/p>\n\nwhere x is the first 2 characters of the hash, y the second 2 characters, z the third 2 characters, and hash the full sha256 content hash of the EML file.<\/p>\n\nFor example, the hash hash://sha256/00002d0fc9e35a9194da7dd3d8ce25eddee40740533f5af2397d6708542b9baa can be found in the file: data/00/00/2d/00002d0fc9e35a9194da7dd3d8ce25eddee40740533f5af2397d6708542b9baa/data . For more information, see [2].<\/p>\n\nThe intended use of this archive is to facilitate meta-analysis of the DataONE dataset network. <\/p>\n\n[1] DataONE, https://www.dataone.org\n[2] https://preston.guoda.bio, https://doi.org/10.5281/zenodo.1410543 . 
DataONE was crawled via Preston with "preston update -u https://dataone.org".\n[3] cat preston-ls.tsv.gz | gunzip | grep "Version" | grep -v "deeplinker" | grep -v "query/solr" | cut -f1,3 | tr '\\t' '\\n' | grep -v "hash://" | sort | uniq | wc -l\n[4] cat preston-ls.tsv.gz | gunzip | grep "Version" | grep -v "deeplinker" | grep -v "query/solr" | cut -f1,3 | tr '\\t' '\\n' | grep "hash://" | sort | uniq | wc -l\n[5] cat preston-ls.tsv.gz | gunzip | grep "Version" | grep "deeplinker" | grep -v "query/solr" | cut -f1,3 | tr '\\t' '\\n' | grep -v "hash://" | sort | uniq | wc -l<\/p>\n\nThis work is funded in part by grant NSF OAC 1839201 from the National Science Foundation.<\/p>"]}more » « less
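As a language-neutral alternative to invoking Preston for each hash, the sketch below resolves hashes to file paths in Python using the data/[x]/[y]/[z]/[hash]/data convention stated above. It assumes eml-hashes.txt was produced by the shell pipeline above, one hash://sha256/... URI per line.

    import os

    def hash_to_path(hash_uri, root="data"):
        # Map a hash://sha256/... URI to its path in the extracted archive,
        # following the data/[x]/[y]/[z]/[hash]/data convention.
        digest = hash_uri.rsplit("/", 1)[-1]
        return os.path.join(root, digest[0:2], digest[2:4], digest[4:6], digest, "data")

    # Locate each EML file listed in eml-hashes.txt.
    with open("eml-hashes.txt") as fh:
        for line in fh:
            path = hash_to_path(line.strip())
            if os.path.exists(path):
                print(path)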
-
{"Abstract":["MCMC chains for the GWB analyses performed in the paper "The NANOGrav 15 yr Data Set: Search for Signals from New Physics<\/em>". <\/p>\n\nThe data is provided in pickle format. Each file contains a NumPy array with the MCMC chain (with burn-in already removed), and a dictionary with the model parameters' names as keys and their priors as values. You can load them as<\/p>\n\nmore » « less
with open ('path/to/file.pkl', 'rb') as pick:\n temp = pickle.load(pick)\n\n params = temp[0]\n chain = temp[1]<\/code>\n\nThe naming convention for the files is the following:<\/p>\n\nigw<\/strong>: inflationary Gravitational Waves (GWs)<\/li>sigw: scalar-induced GWs\n\tsigw_box<\/strong>: assumes a box-like feature in the primordial power spectrum.<\/li>sigw_delta<\/strong>: assumes a delta-like feature in the primordial power spectrum.<\/li>sigw_gauss<\/strong>: assumes a Gaussian peak feature in the primordial power spectrum.<\/li><\/ul>\n\t<\/li>pt: cosmological phase transitions\n\tpt_bubble<\/strong>: assumes that the dominant contribution to the GW productions comes from bubble collisions.<\/li>pt_sound<\/strong>: assumes that the dominant contribution to the GW productions comes from sound waves.<\/li><\/ul>\n\t<\/li>stable: stable cosmic strings\n\tstable-c<\/strong>: stable strings emitting GWs only in the form of GW bursts from cusps on closed loops.<\/li>stable-k<\/strong>: stable strings emitting GWs only in the form of GW bursts from kinks on closed loops.<\/li>stable<\/strong>-m<\/strong>: stable strings emitting monochromatic GW at the fundamental frequency.<\/li>stable-n<\/strong>: stable strings described by numerical simulations including GWs from cusps and kinks.<\/li><\/ul>\n\t<\/li>meta: metastable cosmic strings\n\tmeta<\/strong>-l<\/strong>: metastable strings with GW emission from loops only.<\/li>meta-ls<\/strong> metastable strings with GW emission from loops and segments.<\/li><\/ul>\n\t<\/li>super<\/strong>: cosmic superstrings.<\/li>dw: domain walls\n\tdw-sm<\/strong>: domain walls decaying into Standard Model particles.<\/li>dw-dr<\/strong>: domain walls decaying into dark radiation.<\/li><\/ul>\n\t<\/li><\/ul>\n\nFor each model, we provide four files. One for the run where the new-physics signal is assumed to be the only GWB source. One for the run where the new-physics signal is superimposed to the signal from Supermassive Black Hole Binaries (SMBHB), for these files "_bhb" will be appended to the model name. Then, for both these scenarios, in the "compare" folder we provide the files for the hypermodel runs that were used to derive the Bayes' factors.<\/p>\n\nIn addition to chains for the stochastic models, we also provide data for the two deterministic models considered in the paper (ULDM and DM substructures). 
For the ULDM model, the naming convention of the files is the following (all the ULDM signals are superimposed to the SMBHB signal, see the discussion in the paper for more details)<\/p>\n\nuldm_e<\/strong>: ULDM Earth signal.<\/li>uldm_p: ULDM pulsar signal\n\tuldm_p_cor<\/strong>: correlated limit<\/li>uldm_p_unc<\/strong>: uncorrelated limit<\/li><\/ul>\n\t<\/li>uldm_c: ULDM combined Earth + pulsar signal direct coupling \n\tuldm_c_cor<\/strong>: correlated limit<\/li>uldm_c_unc<\/strong>: uncorrelated limit<\/li><\/ul>\n\t<\/li>uldm_vecB: vector ULDM coupled to the baryon number\n\tuldm_vecB_cor:<\/strong> correlated limit<\/li>uldm_vecB_unc<\/strong>: uncorrelated limit <\/li><\/ul>\n\t<\/li>uldm_vecBL: vector ULDM coupled to B-L\n\tuldm_vecBL_cor:<\/strong> correlated limit<\/li>uldm_vecBL_unc<\/strong>: uncorrelated limit<\/li><\/ul>\n\t<\/li>uldm_c_grav: ULDM combined Earth + pulsar signal for gravitational-only coupling\n\tuldm_c_grav_cor: correlated limit\n\t\tuldm_c_cor_grav_low<\/strong>: low mass region <\/li>uldm_c_cor_grav_mon<\/strong>: monopole region<\/li>uldm_c_cor_grav_low<\/strong>: high mass region<\/li><\/ul>\n\t\t<\/li>uldm_c_unc<\/strong>: uncorrelated limit\n\t\tuldm_c_unc_grav_low<\/strong>: low mass region <\/li>uldm_c_unc_grav_mon<\/strong>: monopole region<\/li>uldm_c_unc_grav_low<\/strong>: high mass region<\/li><\/ul>\n\t\t<\/li><\/ul>\n\t<\/li><\/ul>\n\nFor the substructure (static) model, we provide the chain for the marginalized distribution (as for the ULDM signal, the substructure signal is always superimposed to the SMBHB signal)<\/p>"]}
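Building on the loading snippet above, a minimal sketch of summarizing one chain with NumPy. The file path is a placeholder, and the chain is assumed to have shape (n_samples, n_parameters) with columns ordered as the keys of params; verify this against the release before relying on it.

    import pickle
    import numpy as np

    # Placeholder path: substitute any model file from the release.
    with open('path/to/file.pkl', 'rb') as pick:
        temp = pickle.load(pick)

    params = temp[0]  # dict: parameter name -> prior
    chain = np.asarray(temp[1])  # posterior samples, burn-in removed

    # Assumes one column per parameter, in the same order as params' keys.
    for i, name in enumerate(params):
        med = np.median(chain[:, i])
        lo, hi = np.percentile(chain[:, i], [5, 95])
        print(f"{name}: median={med:.3f}, 90% interval=({lo:.3f}, {hi:.3f})")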