Title: Source code: Basins modulate signatures of river salinization
This resource contains the source code and select data products behind the following Master's thesis: Platt, L. (2024). Basins modulate signatures of river salinization (Master's thesis). University of Wisconsin-Madison, Freshwater and Marine Sciences. The source code is an R-based data processing and modeling pipeline written with the R package "targets". Some of the folders in the source code zip file are intentionally left empty (except for a hidden ".placeholder" file) so that the code repository is set up with the required folder structure. To execute this code, download the zip file, unzip it, and open the salt-modeling-data.Rproj file. Then follow the instructions in the README.md file for installing packages, building the pipeline, and examining the results. Newer versions of this repository may be available on GitHub at github.com/lindsayplatt/salt-modeling-data. In addition to the source code, this resource contains three data files holding intermediate products of the pipeline. The first two contain the data prepared for the random forest modeling. Data download and processing were completed in pipeline phases 1 - 5, and the random forest modeling was completed in phase 6 (see source code).
site_attributes.csv contains the USGS gage site numbers and their associated basin attributes.
site_classifications.csv contains the classification of each site for both episodic signatures ("Episodic" or "Not episodic") and baseflow salinization signatures ("positive", "none", "negative", or NA). An NA in the baseflow classification column means that the site did not meet the minimum data requirements for calculating a trend and was not used in the random forest model for baseflow salinization.
site_attribute_details.csv contains a table of each attribute shorthand used as a column name in site_attributes.csv, with its name, units, description, and data source.
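As a rough illustration of how the first two data files relate, the Python sketch below joins attributes to classifications and drops sites whose baseflow class is NA, mirroring the note above. It is not part of the R pipeline. The column names (`site_no`, `episodic`, `baseflow`, `attr_example`) and the inline rows are placeholders invented for the example; the real headers are documented in site_attribute_details.csv and the files themselves.

```python
import csv
import io

# Placeholder rows for illustration only; real values come from the CSV files.
# All column names here are assumptions, not the actual file headers.
ATTRIBUTES_CSV = """site_no,attr_example
00000001,1.5
00000002,2.5
00000003,3.5
"""

CLASSIFICATIONS_CSV = """site_no,episodic,baseflow
00000001,Episodic,positive
00000002,Not episodic,NA
00000003,Not episodic,none
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, keeping site numbers as strings."""
    return list(csv.DictReader(io.StringIO(text)))


def baseflow_model_table(attributes, classifications):
    """Join attributes to classifications, skipping sites with an NA baseflow class."""
    attrs_by_site = {row["site_no"]: row for row in attributes}
    joined = []
    for row in classifications:
        if row["baseflow"] == "NA":
            continue  # site did not meet the minimum data requirements for a trend
        site_attrs = attrs_by_site.get(row["site_no"])
        if site_attrs is not None:
            joined.append({**site_attrs, **row})
    return joined


table = baseflow_model_table(read_rows(ATTRIBUTES_CSV), read_rows(CLASSIFICATIONS_CSV))
```

Site numbers are read as strings so that leading zeros in USGS gage identifiers are preserved.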
Award ID(s):
2144750
PAR ID:
10529147
Author(s) / Creator(s):
;
Publisher / Repository:
Zenodo
Date Published:
Subject(s) / Keyword(s):
salinization rivers watershed road salt random forest
Format(s):
Medium: X
Right(s):
Creative Commons Attribution 4.0 International
Sponsoring Org:
National Science Foundation
More Like this
  1. # DeepCaImX

     ## Introduction
     Two-photon calcium imaging provides large-scale recordings of neuronal activities at cellular resolution. A robust, automated and high-speed pipeline to simultaneously segment the spatial footprints of neurons and extract their temporal activity traces while decontaminating them from background, noise and overlapping neurons is highly desirable to analyze calcium imaging data. In this paper, we demonstrate DeepCaImX, an end-to-end deep learning method based on an iterative shrinkage-thresholding algorithm and a long short-term memory neural network to achieve the above goals altogether at a very high speed and without any manually tuned hyper-parameters. DeepCaImX is a multi-task, multi-class and multi-label segmentation method composed of a compressed-sensing-inspired neural network with a recurrent layer and fully connected layers. It represents the first neural network that can simultaneously generate accurate neuronal footprints and extract clean neuronal activity traces from calcium imaging data. We trained the neural network with simulated datasets and benchmarked it against existing state-of-the-art methods with in vivo experimental data. DeepCaImX outperforms existing methods in the quality of segmentation and temporal trace extraction as well as processing speed. DeepCaImX is highly scalable and will benefit the analysis of mesoscale calcium imaging.
     ![alt text](https://github.com/KangningZhang/DeepCaImX/blob/main/imgs/Fig1.png)

     ## System and Environment Requirements
     1. Both CPU and GPU are supported to run the DeepCaImX code. A CUDA-compatible GPU is preferred.
     * In the full-version demo, we use a Quadro RTX8000 48GB GPU to accelerate training.
     * In the mini-version demo, at least 6 GB of GPU/CPU memory is required.
     2. Python 3.9 and TensorFlow 2.10.0
     3. Virtual environment: Anaconda Navigator 2.2.0
     4. Matlab 2023a

     ## Demo and installation
     1. (_Optional_) GPU environment setup. Building a GPU-supported environment for training and testing our model requires NVIDIA's parallel computing platform and programming model, the _CUDA Toolkit_, and a GPU-accelerated library of primitives for deep neural networks, the _CUDA Deep Neural Network library (cuDNN)_. The CUDA installation guide is at https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html and the cuDNN installation guide is at https://docs.nvidia.com/deeplearning/cudnn/installation/overview.html.
     2. Install Anaconda. Installation guide: https://docs.anaconda.com/free/anaconda/install/index.html
     3. Launch Anaconda prompt and install Python 3.x and TensorFlow 2.9.0 as the virtual environment.
     4. Open the virtual environment, and then pip install mat73, opencv-python, python-time and scipy.
     5. Download "DeepCaImX_training_demo.ipynb" from the folder "Demo (full-version)" for the full version, and the simulated dataset via the Google Drive link. Then create the path "./Training Dataset/" and put the training dataset in it. If your computing resources are limited, or you only want a quick test of our code, we highly recommend downloading the demo from the folder "Mini-version", which requires only around 6.3 GB of memory in training.
     6. Run: Use Anaconda to launch the virtual environment and open "DeepCaImX_training_demo.ipynb" or "DeepCaImX_testing_demo.ipynb". Then follow the guide in "DeepCaImX_training_demo.ipynb" or "DeepCaImX_testing_demo.ipynb" for training or testing, respectively.
     Note: Every package can be installed in a few minutes.

     ## Run DeepCaImX
     1. Mini-version demo
     * Download all the documents in the folder "Demo (mini-version)".
     * Add the training and testing datasets to the sub-folders "Training Dataset" and "Testing Dataset", respectively.
     * (Optional) Put the pretrained model in the sub-folder "Pretrained Model".
     * Use Anaconda Navigator to launch the virtual environment and open "DeepCaImX_training_demo.ipynb" for training or "DeepCaImX_testing_demo.ipynb" for predicting.

     2. Full-version demo
     * Download all the documents in the folder "Demo (full-version)".
     * Add the training and testing datasets to the sub-folders "Training Dataset" and "Testing Dataset", respectively.
     * (Optional) Put the pretrained model in the sub-folder "Pretrained Model".
     * Use Anaconda Navigator to launch the virtual environment and open "DeepCaImX_training_demo.ipynb" for training or "DeepCaImX_testing_demo.ipynb" for predicting.

     ## Data Tailor
     A data tailor developed in Matlab is provided to support basic data tiling. The folder "Data Tailor" contains a "tailor.m" script and an example "test.tiff". After running "tailor.m" in Matlab, the user can choose a "tiff" file from a GUI as the sample to be tiled. Settings include size of FOV, overlapping area, normalization option, name of output file and output data format. The output files are written to the same folder as "tailor.m".

     ## Simulated Dataset
     1. Dataset generator (FISSA Version): The algorithm for generating the simulated dataset is based on the FISSA paper (_Keemink, S.W., Lowe, S.C., Pakan, J.M.P. et al. FISSA: A neuropil decontamination toolbox for calcium imaging signals. Sci Rep 8, 3493 (2018)_) and the SimCalc repository (https://github.com/rochefort-lab/SimCalc/). For the code used to generate the simulated data, please download the documents in the folder "Simulated Dataset Generator".
     Training dataset: https://drive.google.com/file/d/1WZkIE_WA7Qw133t2KtqTESDmxMwsEkjJ/view?usp=share_link
     Testing dataset: https://drive.google.com/file/d/1zsLH8OQ4kTV7LaqQfbPDuMDuWBcHGWcO/view?usp=share_link

     2. Dataset generator (NAOMi Version): The algorithm for generating the simulated dataset is based on the NAOMi paper (_Song, A., Gauthier, J. L., Pillow, J. W., Tank, D. W. & Charles, A. S. Neural anatomy and optical microscopy (NAOMi) simulation for evaluating calcium imaging methods. Journal of neuroscience methods 358, 109173 (2021)_). For the code used to generate the simulated data, please go to this link: https://bitbucket.org/adamshch/naomi_sim/src/master/code/

     ## Experimental Dataset
     We used samples from the ABO dataset: https://github.com/AllenInstitute/AllenSDK/wiki/Use-the-Allen-Brain-Observatory-%E2%80%93-Visual-Coding-on-AWS.
     The segmentation ground truth can be found in the folder "Manually Labelled ROIs".
     The segmentation ground truth for depths 175, 275, 375, 550 and 625 um was manually labeled by us.
     The code for creating the ground truth of extracted traces can be found in "Prepro_Exp_Sample.ipynb" in the folder "Preprocessing of Experimental Sample".
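The Data Tailor settings mention a field-of-view size and an overlapping area. A minimal Python sketch of what overlapped tiling can look like is given below. It is not a port of "tailor.m": the function names, the square-tile assumption, and the choice to place a final tile flush with the image edge are all assumptions made for illustration.

```python
def tile_origins(length, tile, overlap):
    """Start offsets of tiles of size `tile` along one axis of size `length`.

    Neighbouring tiles share `overlap` pixels. If the regular stride does not
    end exactly at the edge, one extra tile is placed flush with the edge
    (an assumption; the Matlab tool may handle the remainder differently).
    """
    if tile > length:
        raise ValueError("tile must not exceed the image size")
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    step = tile - overlap
    origins = list(range(0, length - tile + 1, step))
    if origins[-1] != length - tile:
        origins.append(length - tile)
    return origins


def tile_grid(height, width, tile, overlap):
    """(row, col) origins of square tiles covering a height x width image."""
    return [
        (r, c)
        for r in tile_origins(height, tile, overlap)
        for c in tile_origins(width, tile, overlap)
    ]
```

Each origin `(r, c)` would then select the sub-image `image[r:r + tile, c:c + tile]` from a frame loaded by any TIFF reader.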
  2. This dataset contains the data used in the paper (arXiv:2301.02398) on the estimation and subtraction of glitches in gravitational wave data using an adaptive spline fitting method called SHAPES. Each .zip file corresponds to one of the glitches considered in the paper. The name of the class to which the glitch belongs (e.g., "Blip") is included in the name of the corresponding .zip file (e.g., BLIP_SHAPESRun_20221229T125928.zip). When uncompressed, each .zip file expands to a folder containing the following. 1) An HDF5 file containing the whitened gravitational wave (GW) strain data in which the glitch appeared. The data has been whitened using a proprietary code. The original (unwhitened) strain data file is available from gwosc.org. The name of the original data file is the part preceding the token '__dtrndWhtnBndpss' in the name of the HDF5 file. 2) A JSON file containing information pertinent to the glitch that was analyzed (e.g., start and stop indices in the whitened data time series). 3) A set of .mat files containing segmented estimates of the glitch as described in the paper. A MATLAB script, plotglitch.m, has been provided that plots, for a given glitch folder name, the data segment that was analyzed in the paper. Another script, plotshapesestimate.m, plots the estimated glitch. These scripts require the JSONLab package.
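The two naming rules described above can be sketched in Python as follows. The token '__dtrndWhtnBndpss' comes from the description; treating '_SHAPESRun_' as the separator after the class name is an assumption inferred from the single example file name, and both function names are hypothetical.

```python
WHITENING_TOKEN = "__dtrndWhtnBndpss"
RUN_SEPARATOR = "_SHAPESRun_"  # assumed from the one example .zip name


def original_strain_name(whitened_name):
    """Return the part of a whitened file name that precedes the whitening token."""
    prefix, sep, _ = whitened_name.partition(WHITENING_TOKEN)
    if not sep:
        raise ValueError(f"token {WHITENING_TOKEN!r} not found in {whitened_name!r}")
    return prefix


def glitch_class(zip_name):
    """Return the glitch class label that leads the .zip file name."""
    label, sep, _ = zip_name.partition(RUN_SEPARATOR)
    if not sep:
        raise ValueError(f"separator {RUN_SEPARATOR!r} not found in {zip_name!r}")
    return label
```

For the example in the description, `glitch_class("BLIP_SHAPESRun_20221229T125928.zip")` yields `"BLIP"`.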
  3. The raw data for the associated manuscript is organized here into three categories: 1) data relating to the measurement and analysis of the native recluse spider loop junctions, 2) raw images found in the figures throughout the manuscript, and 3) data relating to the experiments testing the effect that junction angle has on the strength of two intersecting tapes. It is recommended to browse the data files in Tree mode, which makes the files appear in folders reflecting this organization.
     1) Loxosceles Loop Junction Images and Analysis. The folder titled SEM Raw Images has all of the scanning electron microscopy (SEM) images taken of the native recluse loop junctions. Some images are close-ups of individual junctions and others take a broader perspective (macro) of many loop junctions in series. Where possible, several close-up images of the individual junctions are accompanied by a macro image. These images were imported into ImageJ, where the junction angle was measured. The measurements for all 41 loop junctions observed are in the folder titled Raw Data Files, in the file titled Loxosceles Loop Junction Angle Measurements.txt. The folder titled Raw Data Files contains, in addition to the angle measurements, the raw data for analyzing the strength of individual loop junctions. The data is in native MATLAB data format. These datasets include the complete tensile data and the cross-sectional area data for each spider's silk. The MATLAB code titled Figure_2A_2B_code processes the raw tensile data from the natural recluse spider loop junctions. This data is plotted as two representative curves in Figure 2A and as a complete set as a histogram in Figure 2B. The MATLAB code titled Figure_7_code processes and plots the loop junction data found in Loxosceles Loop Junction Angle Measurements.txt and executes the model of a random set of recluse loops. This code can be executed to generate Figure 7. The folder titled Raw Data Files must be open in MATLAB to run this code. This code uses the MATLAB function areacalculation to calculate the junction area for a given junction angle.
     2) Raw Images. This folder is organized by the figure in the manuscript where each image can be found. Additional metadata accompanies each image.
     3) Tensile Data and Analysis. This folder contains the raw tensile data for all tape-tape junction experiments conducted. All of the tensile data is in the folder titled Raw Data Test Files. Within this folder is a .txt file for each sample tested. The file names are critical to the figure codes working properly because they contain the junction angle and iteration information. The file names are in the format year-month-day_trialnumber_junctionangle.txt. Also in the Raw Data Test Files folder are two functions used within some of the figure codes: fbfill and areacalculation. The figure codes use these functions to analyze the data. To generate any figure using the MATLAB code in this folder, first open the code in MATLAB. Then, within MATLAB, open the folder Raw Data Test Files. Only with this folder open in MATLAB will the code be able to find the correct raw data .txt files. The rest of the contents of this folder are MATLAB codes for specific figures in the manuscript. The only exception is the code titled surfaceenergy_code, which is executed to calculate the phenomenological surface energy for the tapes used in these experiments.
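Because the figure codes depend on the year-month-day_trialnumber_junctionangle.txt naming format, a short Python sketch of how such a name decomposes may help when adding or checking files. The dataset's own code is MATLAB; the class and function names here are hypothetical, and it is an assumption that all three fields are plain numbers (a numeric date, an integer trial, and an angle that parses as a float).

```python
from datetime import date
from pathlib import Path
from typing import NamedTuple


class TensileFile(NamedTuple):
    test_date: date
    trial: int
    junction_angle: float


def parse_tensile_name(filename):
    """Split 'year-month-day_trialnumber_junctionangle.txt' into its three fields."""
    stem = Path(filename).stem  # drops the trailing '.txt'
    date_part, trial_part, angle_part = stem.split("_")
    year, month, day = (int(piece) for piece in date_part.split("-"))
    return TensileFile(date(year, month, day), int(trial_part), float(angle_part))
```

A name that does not split into exactly three underscore-separated fields raises a `ValueError`, which is a quick way to catch misnamed files before running a figure code.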
  4. Data files were used in support of the research paper titled "Mitigating RF Jamming Attacks at the Physical Layer with Machine Learning", which has been submitted to the IET Communications journal.

     All data was collected using the SDR implementation shown here: https://github.com/mainland/dragonradio/tree/iet-paper. For antenna state selection in particular, the files developed for this paper are located in 'dragonradio/scripts/':
     * 'ModeSelect.py': class used to define the antenna state selection algorithm
     * 'standalone-radio.py': SDR implementation for normal radio operation with reconfigurable antenna
     * 'standalone-radio-tuning.py': SDR implementation for hyperparameter tuning
     * 'standalone-radio-onmi.py': SDR implementation for omnidirectional mode only

     Authors: Marko Jacovic, Xaime Rivas Rey, Geoffrey Mainland, Kapil R. Dandekar
     Contact: krd26@drexel.edu

     Top-level directories and their content are described below. Detailed descriptions of the experiments performed are provided in the paper.

     classifier_training: files used for training the classifiers that are integrated into the SDR platform
     * 'logs-8-18' directory contains OTA SDR collected log files for each jammer type and under normal operation (including congested and weaklink states)
     * 'classTrain.py' is the main parser for training the classifiers
     * 'trainedClassifiers' contains the output classifiers generated by 'classTrain.py'

     post_processing_classifier: contains logs of online classifier outputs and the processing script
     * 'class' directory contains .csv logs of each RTE and OTA experiment for each jamming and operation scenario
     * 'classProcess.py' parses the log files and provides a classification report and confusion matrix for each multi-class and binary classifier for each observed scenario, found in 'results->classifier_performance'

     post_processing_mgen: contains MGEN receiver logs and parser
     * 'configs' contains JSON files to be used with the parser for each experiment
     * 'mgenLogs' contains MGEN receiver logs for each OTA and RTE experiment described. Within each experiment, logs are separated by 'mit' for mitigation used, 'nj' for no jammer, and 'noMit' for no mitigation technique used. File names take the form *_cj_* for constant jammer, *_pj_* for periodic jammer, *_rj_* for reactive jammer, and *_nj_* for no jammer. Performance figures are found in 'results->mitigation_performance'

     ray_tracing_emulation: contains files related to the Drexel area, Art Museum, and UAV Drexel area validation RTE studies.
     * Directory contains a detailed 'readme.txt' for understanding.
     * Please note: the processing files and data logs present in the 'validation' folder were developed by Wolfe et al. and should be cited as such, unless explicitly stated otherwise.
       S. Wolfe, S. Begashaw, Y. Liu and K. R. Dandekar, "Adaptive Link Optimization for 802.11 UAV Uplink Using a Reconfigurable Antenna," MILCOM 2018 - 2018 IEEE Military Communications Conference (MILCOM), 2018, pp. 1-6, doi: 10.1109/MILCOM.2018.8599696.

     results: contains results obtained from the study
     * 'classifier_performance' contains .txt files summarizing binary and multi-class performance of the online SDR system. Files obtained using 'post_processing_classifier'.
     * 'mitigation_performance' contains figures generated by 'post_processing_mgen'.
     * 'validation' contains the RTE and OTA performance comparison obtained by 'ray_tracing_emulation->validation->matlab->outdoor_hover_plots.m'

     tuning_parameter_study: contains the OTA log files for the antenna state selection hyperparameter study
     * 'dataCollect' contains a folder for each jammer considered in the study, and inside each folder there is a CSV file corresponding to a different configuration of the learning parameters of the reconfigurable antenna. The configuration selected was the one that performed best across all these experiments and is described in the paper.
     * 'data_summary.txt' contains the summaries from all the CSV files for convenience.
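The file-name convention for the MGEN logs (*_cj_*, *_pj_*, *_rj_*, *_nj_*) lends itself to a small lookup when sorting logs by scenario. The Python sketch below is an illustration only and is not taken from the dataset's parser; the function name and the example file names in the usage note are hypothetical.

```python
# Tokens and their meanings as listed in the dataset description.
JAMMER_TOKENS = {
    "_cj_": "constant jammer",
    "_pj_": "periodic jammer",
    "_rj_": "reactive jammer",
    "_nj_": "no jammer",
}


def jammer_type(filename):
    """Return the jammer scenario encoded in a log file name."""
    matches = [label for token, label in JAMMER_TOKENS.items() if token in filename]
    if len(matches) != 1:
        # Zero or several tokens: refuse to guess which scenario applies.
        raise ValueError(f"expected exactly one jammer token in {filename!r}")
    return matches[0]
```

For a hypothetical log named `exp1_cj_run.log`, `jammer_type` returns `"constant jammer"`; a name with no token or with more than one raises `ValueError` rather than guessing.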
  5. This Python script queries the USGS StreamStats Service API for a list of available basin characteristics, and the values for those characteristics, for each input site. The script takes as input a matrix of site identifiers and location coordinates and returns 1) a matrix of values for available basin characteristics obtained from StreamStats for each input location and 2) a matrix of basin characteristic variable names and definitions. To run this script exactly as written, create 3 columns of data in comma-separated format: 1) 'Site,' which are the study site identifiers, 2) 'lonSS,' the longitudinal coordinates, and 3) 'latSS,' the latitudinal coordinates (in decimal degrees). Name the input file 'ssLocs.csv' and store it in a subfolder named 'Data.' Otherwise, the pathnames for input and output files can be modified within the script. The output files 'ssDats.csv' and 'Descriptions.csv' will also be saved to the subfolder 'Data'. Multiple code runs may be necessary to obtain information for all sites; as long as the output file 'ssDats.csv' remains in the 'Data' folder, the script will only query for sites with missing information. If the program returns an error or is unable to obtain data for a site after several attempts, it may be that the input coordinates do not point to a cell defined as water in the StreamStats application. A solution is to check the coordinates manually in the StreamStats web application (http://streamstats.usgs.gov). This script was developed as part of the analysis described in: URycki DR, Good SP, Crump BC, Chadwick J and Jones GD (2020) River Microbiome Composition Reflects Macroscale Climatic and Geomorphic Differences in Headwater Streams. Front. Water 2:574728. doi: 10.3389/frwa.2020.574728 
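The resume behaviour described above (re-running only for sites with missing information) can be sketched as a comparison between the input and output tables. This is not the published script and makes no StreamStats API calls. The input columns 'Site', 'lonSS' and 'latSS' come from the description; the assumptions are that 'ssDats.csv' also carries a 'Site' column and that a site counts as done once it has a row there. The inline rows and the `EXAMPLE_CHAR` column are placeholders.

```python
import csv
import io


def pending_sites(locs_text, dats_text=""):
    """Return the input rows whose 'Site' has no row yet in the output table."""
    locations = list(csv.DictReader(io.StringIO(locs_text)))
    done = set()
    if dats_text:
        # Assumption: the output file identifies sites in a 'Site' column.
        done = {row["Site"] for row in csv.DictReader(io.StringIO(dats_text))}
    return [row for row in locations if row["Site"] not in done]


# Placeholder contents standing in for Data/ssLocs.csv and Data/ssDats.csv.
SS_LOCS = """Site,lonSS,latSS
A,-123.10,44.50
B,-122.90,44.60
C,-122.70,44.70
"""

SS_DATS = """Site,EXAMPLE_CHAR
A,12.3
"""

todo = pending_sites(SS_LOCS, SS_DATS)
```

On a first run, when no output file exists yet, passing an empty string for the second argument returns every input site.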