{"Abstract":["Data files were used in support of the research paper titled \u201cMitigating RF Jamming Attacks at the Physical Layer with Machine Learning<\/em>" which has been submitted to the IET Communications journal.<\/p>\n\n---------------------------------------------------------------------------------------------<\/p>\n\nAll data was collected using the SDR implementation shown here: https://github.com/mainland/dragonradio/tree/iet-paper. Particularly for antenna state selection, the files developed for this paper are located in 'dragonradio/scripts/:'<\/p>\n\n'ModeSelect.py': class used to defined the antenna state selection algorithm<\/li>'standalone-radio.py': SDR implementation for normal radio operation with reconfigurable antenna<\/li>'standalone-radio-tuning.py': SDR implementation for hyperparameter tunning<\/li>'standalone-radio-onmi.py': SDR implementation for omnidirectional mode only<\/li><\/ul>\n\n---------------------------------------------------------------------------------------------<\/p>\n\nAuthors: Marko Jacovic, Xaime Rivas Rey, Geoffrey Mainland, Kapil R. Dandekar\nContact: krd26@drexel.edu<\/p>\n\n---------------------------------------------------------------------------------------------<\/p>\n\nTop-level directories and content will be described below. Detailed descriptions of experiments performed are provided in the paper.<\/p>\n\n---------------------------------------------------------------------------------------------<\/p>\n\nclassifier_training: files used for training classifiers that are integrated into SDR platform<\/p>\n\n'logs-8-18' directory contains OTA SDR collected log files for each jammer type and under normal operation (including congested and weaklink states)<\/li>'classTrain.py' is the main parser for training the classifiers<\/li>'trainedClassifiers' contains the output classifiers generated by 'classTrain.py'<\/li><\/ul>\n\npost_processing_classifier: contains logs of online classifier outputs and processing script<\/p>\n\n'class' directory contains .csv logs of each RTE and OTA experiment for each jamming and operation scenario<\/li>'classProcess.py' parses the log files and provides classification report and confusion matrix for each multi-class and binary classifiers for each observed scenario - found in 'results->classifier_performance'<\/li><\/ul>\n\npost_processing_mgen: contains MGEN receiver logs and parser<\/p>\n\n'configs' contains JSON files to be used with parser for each experiment<\/li>'mgenLogs' contains MGEN receiver logs for each OTA and RTE experiment described. Within each experiment logs are separated by 'mit' for mitigation used, 'nj' for no jammer, and 'noMit' for no mitigation technique used. File names take the form *_cj_* for constant jammer, *_pj_* for periodic jammer, *_rj_* for reactive jammer, and *_nj_* for no jammer. Performance figures are found in 'results->mitigation_performance'<\/li><\/ul>\n\nray_tracing_emulation: contains files related to Drexel area, Art Museum, and UAV Drexel area validation RTE studies.<\/p>\n\nDirectory contains detailed 'readme.txt' for understanding.<\/li>Please note: the processing files and data logs present in 'validation' folder were developed by Wolfe et al. and should be cited as such, unless explicitly stated differently. \n\tS. Wolfe, S. Begashaw, Y. Liu and K. R. Dandekar, "Adaptive Link Optimization for 802.11 UAV Uplink Using a Reconfigurable Antenna," MILCOM 2018 - 2018 IEEE Military Communications Conference (MILCOM), 2018, pp. 
1-6, doi: 10.1109/MILCOM.2018.8599696.<\/li><\/ul>\n\t<\/li><\/ul>\n\nresults: contains results obtained from study<\/p>\n\n'classifier_performance' contains .txt files summarizing binary and multi-class performance of online SDR system. Files obtained using 'post_processing_classifier.'<\/li>'mitigation_performance' contains figures generated by 'post_processing_mgen.'<\/li>'validation' contains RTE and OTA performance comparison obtained by 'ray_tracing_emulation->validation->matlab->outdoor_hover_plots.m'<\/li><\/ul>\n\ntuning_parameter_study: contains the OTA log files for antenna state selection hyperparameter study<\/p>\n\n'dataCollect' contains a folder for each jammer considered in the study, and inside each folder there is a CSV file corresponding to a different configuration of the learning parameters of the reconfigurable antenna. The configuration selected was the one that performed the best across all these experiments and is described in the paper.<\/li>'data_summary.txt'this file contains the summaries from all the CSV files for convenience.<\/li><\/ul>"]}
Software Environments in Binder Containers
{"Abstract":["Binder is a publicly accessible online service for executing interactive notebooks based on Git repositories. Binder dynamically builds and deploys containers following a recipe stored in the repository, then gives the user a browser-based notebook interface. The Binder group periodically releases a log of container launches from the public Binder service. Archives of launch records are available here. These records do not include identifiable information like IP addresses, but do give the source repo being launched along with some other metadata. The main content of this dataset is in the
`binder.sqlite` file. This SQLite database includes launch records from 2018-11-03 to 2021-06-06 in the `events` table, which has the following schema.

```sql
CREATE TABLE events(
    version INTEGER,
    timestamp TEXT,
    provider TEXT,
    spec TEXT,
    origin TEXT,
    ref TEXT,
    guessed_ref TEXT
);
CREATE INDEX idx_timestamp ON events(timestamp);
```
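The database can be queried directly with Python's standard `sqlite3` module; a minimal sketch, assuming a local copy of `binder.sqlite`:

```python
import sqlite3

con = sqlite3.connect("binder.sqlite")

# Count launch records per provider, most common first.
rows = con.execute(
    "SELECT provider, COUNT(*) AS n FROM events "
    "GROUP BY provider ORDER BY n DESC"
).fetchall()
for provider, n in rows:
    print(provider, n)
```

The fields of the `events` table are as follows.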
- `version` indicates the version of the record as assigned by Binder. The `origin` field became available with version 3, and the `ref` field with version 4. Older records where this information was not recorded have the corresponding fields set to null.
- `timestamp` is the ISO timestamp of the launch.
- `provider` gives the type of source repo being launched ("GitHub" is by far the most common). The explanations below assume GitHub; other providers may differ.
- `spec` gives the particular branch/release/commit being built. It consists of `<github-id>/<repo>/<branch>`.
- `origin` indicates which backend was used. Each backend has its own storage, compute, etc., so this information may be important when evaluating caching and performance. Only recent records include this field; it may be null.
- `ref` specifies the git commit that was actually used, rather than the named branch referenced by `spec`. This was not recorded from the beginning, so only the more recent entries include it; it may be null.
- `guessed_ref` records, for records where `ref` is not available, the commit found at the time of cloning: we attempted to clone the named reference given by `spec` rather than the specific commit (see below). If the branch was updated after the container was launched, this will not be the exact version that was used, but rather whatever was available at the time of cloning (early 2021). Depending on the application, this may still be useful information. Selecting only records with version 4 (or non-null `ref`) will exclude these guessed commits. May be null.

The Binder launch dataset identifies the source repos that were used, but gives no indication of their contents. We crawled GitHub to retrieve the actual specification files that were fed into repo2docker when preparing the notebook environments, as well as filesystem metadata of the repos. Some repos were deleted or made private at some point and were thus skipped; this is indicated by the absence of any row for the given commit (or the absence of both `ref` and `guessed_ref` in the `events` table). The schema is as follows.

```sql
CREATE TABLE spec_files (
    ref TEXT NOT NULL PRIMARY KEY,
    ls TEXT,
    runtime BLOB,
    apt BLOB,
    conda BLOB,
    pip BLOB,
    pipfile BLOB,
    julia BLOB,
    r BLOB,
    nix BLOB,
    docker BLOB,
    setup BLOB,
    postbuild BLOB,
    start BLOB
);
```
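Because `ref` in `spec_files` lines up with `ref` (or `guessed_ref`) in `events`, as explained next, launches can be joined to the crawled contents of their repos; a minimal sketch, under the same assumption of a local `binder.sqlite`:

```python
import sqlite3

con = sqlite3.connect("binder.sqlite")

# Join each launch to its crawled spec files, preferring the exact
# commit (ref) and falling back to the guessed commit (guessed_ref).
query = """
    SELECT e.timestamp, e.spec, s.docker IS NOT NULL AS has_dockerfile
    FROM events e
    JOIN spec_files s ON s.ref = COALESCE(e.ref, e.guessed_ref)
    LIMIT 10
"""
for timestamp, spec, has_dockerfile in con.execute(query):
    print(timestamp, spec, bool(has_dockerfile))
```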
Here `ref` corresponds to `ref` and/or `guessed_ref` from the `events` table. For each repo, we collected spec files into the following fields (see the repo2docker docs for details on what these are). The records in the database are simply the verbatim file contents, with no parsing or further processing performed.
- `runtime`: `runtime.txt`
- `apt`: `apt.txt`
- `conda`: `environment.yml`
- `pip`: `requirements.txt`
- `pipfile`: `Pipfile.lock` or `Pipfile`
- `julia`: `Project.toml` or `REQUIRE`
- `r`: `install.R`
- `nix`: `default.nix`
- `docker`: `Dockerfile`
- `setup`: `setup.py`
- `postbuild`: `postBuild`
- `start`: `start`

The `ls` field gives a metadata listing of the repo contents (excluding the `.git` directory). This field is JSON encoded, with the following structure based on JSON types:
- Object: a filesystem directory. Keys are the file names within it; values are the contents, which can be regular files, symlinks, or subdirectories.
- String: a symlink. The string value gives the link target.
- Number: a regular file. The number value gives the file size in bytes.

```sql
CREATE TABLE clean_specs (
    ref TEXT NOT NULL PRIMARY KEY,
    conda_channels TEXT,
    conda_packages TEXT,
    pip_packages TEXT,
    apt_packages TEXT
);
```

The `clean_specs` table provides parsed and validated specifications for some of the specification files (currently pip, conda, and APT packages). Each column gives either a JSON-encoded list of package requirements, or null. APT packages were validated using a regex adapted from the repo2docker source. Pip packages were parsed and normalized using the `Requirement` class from the `pkg_resources` package of setuptools. Conda packages were parsed and normalized using the `conda.models.match_spec.MatchSpec` class included with the library form of conda (distinct from the command-line tool). Users may want to use these parsers when working with the package data, as the specifications can become fairly complex.
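Putting the pieces together, a minimal sketch of reading the listing and the cleaned pip requirements for a single commit (the `ref` value is a placeholder to be taken from the `events` table):

```python
import json
import sqlite3

con = sqlite3.connect("binder.sqlite")  # local copy assumed
ref = "..."  # a commit hash taken from the events table

ls_json, pip_json = con.execute(
    "SELECT s.ls, c.pip_packages FROM spec_files s "
    "LEFT JOIN clean_specs c ON c.ref = s.ref WHERE s.ref = ?",
    (ref,),
).fetchone()

# Walk the JSON tree: dicts are directories, strings are symlinks,
# numbers are regular-file sizes in bytes.
def total_size(node):
    if isinstance(node, dict):
        return sum(total_size(v) for v in node.values())
    return 0 if isinstance(node, str) else node

print("repo size in bytes:", total_size(json.loads(ls_json)))
print("pip requirements:", json.loads(pip_json) if pip_json else None)
```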
The `missing` table gives the repos that were not accessible, and `event_logs` records which log files have already been added. These tables are used for updating the dataset and should not be of interest to users.
- Award ID(s):
- 1931348
- PAR ID:
- 10356920
- Publisher / Repository:
- Zenodo
- Date Published:
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
{"Abstract":["MCMC chains for the GWB analyses performed in the paper "The NANOGrav 15 yr Data Set: Search for Signals from New Physics<\/em>". <\/p>\n\nThe data is provided in pickle format. Each file contains a NumPy array with the MCMC chain (with burn-in already removed), and a dictionary with the model parameters' names as keys and their priors as values. You can load them as<\/p>\n\nmore » « less
```python
import pickle

with open("path/to/file.pkl", "rb") as pick:
    temp = pickle.load(pick)

params = temp[0]  # dict: parameter name -> prior
chain = temp[1]   # NumPy array with the MCMC samples (burn-in removed)
```
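Continuing from the snippet above, the samples for individual parameters can then be pulled out of the chain; note the assumption, not stated explicitly here, that chain columns follow the ordering of the `params` keys:

```python
# Assumption: column i of the chain corresponds to the i-th key of params.
for i, name in enumerate(params):
    print(name, params[name], chain[:, i].mean())
```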
The naming convention for the files is the following:
- igw: inflationary gravitational waves (GWs)
- sigw: scalar-induced GWs
  - sigw_box: assumes a box-like feature in the primordial power spectrum
  - sigw_delta: assumes a delta-like feature in the primordial power spectrum
  - sigw_gauss: assumes a Gaussian peak feature in the primordial power spectrum
- pt: cosmological phase transitions
  - pt_bubble: assumes that the dominant contribution to GW production comes from bubble collisions
  - pt_sound: assumes that the dominant contribution to GW production comes from sound waves
- stable: stable cosmic strings
  - stable-c: stable strings emitting GWs only in the form of GW bursts from cusps on closed loops
  - stable-k: stable strings emitting GWs only in the form of GW bursts from kinks on closed loops
  - stable-m: stable strings emitting monochromatic GWs at the fundamental frequency
  - stable-n: stable strings described by numerical simulations including GWs from cusps and kinks
- meta: metastable cosmic strings
  - meta-l: metastable strings with GW emission from loops only
  - meta-ls: metastable strings with GW emission from loops and segments
- super: cosmic superstrings
- dw: domain walls
  - dw-sm: domain walls decaying into Standard Model particles
  - dw-dr: domain walls decaying into dark radiation

For each model, we provide four files: one for the run where the new-physics signal is assumed to be the only GWB source, and one for the run where the new-physics signal is superimposed on the signal from supermassive black hole binaries (SMBHB); for the latter, "_bhb" is appended to the model name. Then, for both of these scenarios, the "compare" folder provides the files for the hypermodel runs that were used to derive the Bayes factors.

In addition to chains for the stochastic models, we also provide data for the two deterministic models considered in the paper (ULDM and DM substructures).
For the ULDM model, the naming convention of the files is the following (all the ULDM signals are superimposed on the SMBHB signal; see the discussion in the paper for more details):

- uldm_e: ULDM Earth signal
- uldm_p: ULDM pulsar signal
  - uldm_p_cor: correlated limit
  - uldm_p_unc: uncorrelated limit
- uldm_c: ULDM combined Earth + pulsar signal, direct coupling
  - uldm_c_cor: correlated limit
  - uldm_c_unc: uncorrelated limit
- uldm_vecB: vector ULDM coupled to the baryon number
  - uldm_vecB_cor: correlated limit
  - uldm_vecB_unc: uncorrelated limit
- uldm_vecBL: vector ULDM coupled to B-L
  - uldm_vecBL_cor: correlated limit
  - uldm_vecBL_unc: uncorrelated limit
- uldm_c_grav: ULDM combined Earth + pulsar signal for gravitational-only coupling
  - uldm_c_grav_cor: correlated limit
    - uldm_c_cor_grav_low: low mass region
    - uldm_c_cor_grav_mon: monopole region
    - uldm_c_cor_grav_low: high mass region
  - uldm_c_unc: uncorrelated limit
    - uldm_c_unc_grav_low: low mass region
    - uldm_c_unc_grav_mon: monopole region
    - uldm_c_unc_grav_low: high mass region

For the substructure (static) model, we provide the chain for the marginalized distribution (as with the ULDM signal, the substructure signal is always superimposed on the SMBHB signal).
-
The historical settlement data compilation for Spain (HISDAC-ES) is a geospatial dataset consisting of over 240 gridded surfaces measuring the physical, functional, age-related, and evolutionary characteristics of the Spanish building stock. We scraped, harmonized, and aggregated cadastral building footprint data for Spain, covering over 12,000,000 building footprints including construction year attributes, to create a multi-faceted series of gridded surfaces (GeoTIFF format) describing the evolution of human settlements in Spain from 1900 to 2020, at 100 m spatial and 5-year temporal resolution. The dataset also contains aggregated characteristics and completeness statistics at the municipality level, in CSV and GeoPackage format.

UPDATE 08-2023: We provide a new, improved version of HISDAC-ES. Specifically, we fixed two bugs in the production code that caused an incorrect rasterization of the multitemporal BUFA layers and of the PHYS layers (BUFA, BIA, DWEL, BUNITS sum and mean). Moreover, we added decadal raster datasets measuring residential building footprint and building indoor area (1900-2020), and provide a country-wide, harmonized building footprint centroid dataset in GeoPackage vector data format.

File descriptions:

Datasets are available in three spatial reference systems:

- HISDAC-ES_All_LAEA.zip: raster data in Lambert Azimuthal Equal Area (LAEA), covering all Spanish territory
- HISDAC-ES_IbericPeninsula_UTM30.zip: raster data in UTM Zone 30N, covering the Iberian Peninsula plus Ceuta and Melilla
- HISDAC-ES_CanaryIslands_REGCAN.zip: raster data in REGCAN-95, covering the Canary Islands only
- HISDAC-ES_MunicipAggregates.zip: municipality-level aggregates and completeness statistics (CSV, GeoPackage), in LAEA projection
- ES_building_centroids_merged_spatjoin.gpkg: 7,000,000+ building footprint centroids in GeoPackage format, harmonized from the different cadastral systems, representing the input data for HISDAC-ES. These data can be used for sanity checks or for the creation of further, user-defined gridded surfaces.

Source data:

HISDAC-ES is derived from cadastral building footprint data, available from different authorities in Spain:

- Araba province: https://geo.araba.eus/WFS_Katastroa?SERVICE=WFS&VERSION=1.1.0&REQUEST=GetCapabilities
- Bizkaia province: https://web.bizkaia.eus/es/inspirebizkaia
- Gipuzkoa province: https://b5m.gipuzkoa.eus/web5000/es/utilidades/inspire/edificios/
- Navarra region: https://inspire.navarra.es/services/BU/wfs
- Other regions: http://www.catastro.minhap.es/INSPIRE/buildings/ES.SDGC.bu.atom.xml
- Municipality polygons: Centro Nacional de Información Geográfica (https://centrodedescargas.cnig.es/CentroDescargas/index.jsp)

Technical notes, gridded data:

File nomenclature: ./region_projection_theme/hisdac_es_theme_variable_version_resolution[m][_year].tif

Regions:
- all: complete territory of Spain
- can: Canary Islands only
- ibe: Iberian Peninsula plus Ceuta and Melilla

Projections:
- laea: Lambert azimuthal equal area (EPSG:3035)
- regcan: REGCAN95 / UTM zone 28N (EPSG:4083)
- utm: ETRS89 / UTM zone 30N (EPSG:25830)

Themes:
- evolution / evol: multi-temporal physical measurements
- landuse: multi-temporal building counts per land use (i.e., building function) class
- physical / phys: physical building characteristics in 2020
- temporal / temp: temporal characteristics (construction year statistics)

Variables (evolution):
- budens: building density (count per grid cell area)
- bufa: building footprint area
- deva: developed area (any grid cell containing at least one building)
- resbufa: residential building footprint area
- resbia: residential building indoor area

Variables (physical):
- bia: building indoor area
- bufa: building footprint area
- bunits: number of building units
- dwel: number of dwellings

Variables (temporal):
- mincoy: minimum construction year per grid cell
- maxcoy: maximum construction year per grid cell
- meancoy: mean construction year per grid cell
- medcoy: median construction year per grid cell
- modecoy: mode (most frequent) construction year per grid cell
- varcoy: variety of construction years per grid cell

Variable (landuse): counts of buildings per grid cell and land use type.

Municipality-level data:

- hisdac_es_municipality_stats_multitemporal_longform_v1.csv: zonal sums of the gridded surfaces (e.g., number of buildings per year and municipality) in long form. Note that a value of 0 for the year attribute denotes the statistics for records without construction year information.
- hisdac_es_municipality_stats_multitemporal_wideform_v1.csv: zonal sums of the gridded surfaces (e.g., number of buildings per year and municipality) in wide form. Note that a value of 0 for the year suffix denotes the statistics for records without construction year information.
- hisdac_es_municipality_stats_completeness_v1.csv: missingness rates (in %) of the building attributes per municipality, ranging from 0.0 (attribute exists for all buildings) to 100.0 (attribute exists for none of the buildings) in a given municipality.

Column names for the completeness statistics tables:

- NATCODE: national municipality identifier*
- num_total: number of buildings per municipality
- perc_bymiss: percentage of buildings with missing built year (construction year)
- perc_lumiss: percentage of buildings with missing land use attribute
- perc_luother: percentage of buildings with land use type "other"
- perc_num_floors_miss: percentage of buildings without a valid number-of-floors attribute
- perc_num_dwel_miss: percentage of buildings without a valid number-of-dwellings attribute
- perc_num_bunits_miss: percentage of buildings without a valid number-of-building-units attribute
- perc_offi_area_miss: percentage of buildings without a valid official area (building indoor area, BIA) attribute
- perc_num_dwel_and_num_bunits_miss: percentage of buildings missing both the number-of-dwellings and number-of-building-units attributes

The same statistics are available as a GeoPackage file including municipality polygons in Lambert azimuthal equal area (EPSG:3035).

*From the NATCODE, other regional identifiers can be derived as follows. For NATCODE 34 01 04 04001: country 34, comunidad autónoma (CA_CODE) 01, province (PROV_CODE) 04, LAU code 04001 (province + municipality code).
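Since the regional identifiers are fixed-width substrings of the NATCODE, they can be recovered with simple slicing; a minimal sketch in Python:

```python
def parse_natcode(natcode: str) -> dict:
    """Split an 11-digit NATCODE into its regional identifiers.

    Layout per the description above: country (2 digits) +
    comunidad autonoma (2) + province (2) + LAU code (5).
    """
    return {
        "country": natcode[0:2],
        "ca_code": natcode[2:4],
        "prov_code": natcode[4:6],
        "lau_code": natcode[6:11],
    }

print(parse_natcode("34010404001"))
# {'country': '34', 'ca_code': '01', 'prov_code': '04', 'lau_code': '04001'}
```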
-
This dataset contains the data used in the paper (arXiv:2301.02398) on the estimation and subtraction of glitches in gravitational wave data using an adaptive spline fitting method called SHAPES. Each .zip file corresponds to one of the glitches considered in the paper. The name of the class to which the glitch belongs (e.g., "Blip") is included in the name of the corresponding .zip file (e.g., BLIP_SHAPESRun_20221229T125928.zip). When uncompressed, each .zip file expands to a folder containing the following:

- An HDF5 file containing the whitened gravitational wave (GW) strain data in which the glitch appeared. The data has been whitened using a proprietary code. The original (unwhitened) strain data file is available from gwosc.org; the name of the original data file is the part preceding the token '__dtrndWhtnBndpss' in the name of the file.
- A JSON file containing information pertinent to the glitch that was analyzed (e.g., start and stop indices in the whitened data time series).
- A set of .mat files containing segmented estimates of the glitch, as described in the paper.

A MATLAB script, plotglitch.m, has been provided that plots, for a given glitch folder name, the data segment that was analyzed in the paper. Another script, plotshapesestimate.m, plots the estimated glitch. These scripts require the JSONLab package.
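The HDF5 strain file can also be inspected from Python; a minimal sketch, assuming the h5py package and a hypothetical file name (the dataset layout inside the file is not specified here, so we only enumerate it):

```python
import h5py

# Hypothetical path to the HDF5 file inside an uncompressed glitch folder.
with h5py.File("BLIP_SHAPESRun_20221229T125928/strain.hdf5", "r") as f:
    f.visit(print)  # print the name of every group and dataset in the file
```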
-
This dataset consists of 800 coordinate files (in the CHARMM psf/cor format) for the QM/MM minimum energy pathways (MEPs) of the acylation reactions between a Class A beta-lactamase (Toho-1) and two beta-lactam antibiotic molecules (ampicillin and cefalexin).

These files are:

- toho_amp.r1-ae.zip: the R1-AE acylation pathways for Toho-1/ampicillin (200 pathways)
- toho_amp.r2-ae.zip: the R2-AE acylation pathways for Toho-1/ampicillin (200 pathways)
- toho_cex.r1-ae.zip: the R1-AE acylation pathways for Toho-1/cefalexin (200 pathways)
- toho_cex.r2-ae.zip: the R2-AE acylation pathways for Toho-1/cefalexin (200 pathways)
- energies.zip: the replica energies at the B3LYP-D3/6-31+G**/C36 level
- chelpgs.zip: the ChElPG charges of all reactant replicas at the B3LYP-D3/6-31+G**/C36 level
- farrys.zip: the featurized NumPy arrays for model training
- peephole.zip: an example file showing what the optimized MEPs look like
- dftb3_benchmark.zip: the reference calculations justifying the use of DFTB3/3OB-f/C36 in the MEP optimizations; the reference level of theory is B3LYP-D3/6-31G**/C36

The R1-AE pathways are the acylation reactions that use Glu166 as the general base; the R2-AE pathways use Lys73 and Glu166 as the concerted base. All QM/MM pathways are optimized at the DFTB3/3OB-f/CHARMM36 level of theory.

Z. Song et al., "Mechanistic Insights into Enzyme Catalysis from Explaining Machine-Learned Quantum Mechanical and Molecular Mechanical Minimum Energy Pathways," ACS Physical Chemistry Au, in press. DOI: 10.1021/acsphyschemau.2c00005
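The arrays extracted from farrys.zip can be loaded with NumPy; a minimal sketch with a hypothetical file name, since the archive's contents are not listed here:

```python
import numpy as np

# Hypothetical path: one of the featurized arrays extracted from farrys.zip.
arr = np.load("farrys/features_example.npy")
print(arr.shape, arr.dtype)
```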