Title: Open-source Software Governance Documentation Dataset on GitHub
This dataset contains 710 GitHub-hosted OSS projects, each of which has a governance file in the project's root directory. It also contains the commits, issues, and comments for each project.
Award ID(s):
2217652 2020751
PAR ID:
10423391
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
Zenodo
Date Published:
Edition / Version:
1.0.0
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Data from: Stone and Wessinger 2023, "Ecological diversification in an adaptive radiation of plants: the role of de novo mutation and introgression". DOI: 10.1101/2023.11.01.565185. The code used to conduct analyses from this study can be found here: https://github.com/benstemon/MBE-23-0936. The raw sequencing reads generated from this study have been deposited on the SRA under Project number PRJNA1057825. This repository contains a README.md file, which contains information on all files included.
  2. This item contains version 5.0 of the Madidi Project's full dataset. The zip file contains (1) raw data, which was downloaded from Tropicos (www.tropicos.org) on August 18, 2020; (2) R scripts used to modify, correct, and clean the raw data; (3) clean data that are the output of the R scripts, and which are the point of departure for most uses of the Madidi Dataset; (4) post-cleaning scripts that obtain additional but non-essential information from the clean data (e.g. by extracting environmental data from rasters); and (5) a miscellaneous collection of additional non-essential information and figures. This item also includes the Data Use Policy for this dataset.

     The core dataset of the Madidi Project consists of a network of ~500 forest plots distributed in and around the Madidi National Park in Bolivia. This network contains 50 permanently marked large plots (1-ha), as well as >450 temporary small plots (0.1-ha). Within the large plots, all woody individuals with a dbh ≥10 cm have been mapped, tagged, measured, and identified. Some of these plots have also been re-visited, and information on mortality, recruitment, and growth exists. Within the small plots, all woody individuals with a dbh ≥2.5 cm have been measured and identified. Each plot has some edaphic and topographic information, and some large plots have information on various plant functional traits.

     The Madidi Project is a collaborative research effort to document and study plant biodiversity in the Amazonia and Tropical Andes of northwestern Bolivia. The project is currently led by the Missouri Botanical Garden (MBG), in collaboration with the Herbario Nacional de Bolivia. The management of the project is at MBG, where J. Sebastian Tello (sebastian.tello@mobot.org) is the scientific director. The director oversees the activities of a research team based in Bolivia. MBG works in collaboration with other data contributors (currently: Manuel J. Macía [manuel.macia@uam.es], Gabriel Arellano [gabriel.arellano.torres@gmail.com] and Beatriz Nieto [sonneratia@gmail.com]), with a representative from the Herbario Nacional de Bolivia (LPB; Carla Maldonado [carla.maldonado1@gmail.com]), as well as with other closely associated researchers from various institutions. For more information regarding the organization and objectives of the Madidi Project, you can visit the project's website (www.madidiproject.weebly.com).

     The Madidi Project has been supported by generous grants from the National Science Foundation (DEB 0101775, DEB 0743457, DEB 1836353) and the National Geographic Society (NGS 7754-04 and NGS 8047-06). Additional financial support for the Madidi Project has been provided by the Missouri Botanical Garden, the Comunidad de Madrid (Spain), the Universidad Autónoma de Madrid, and the Taylor and Davidson families.
  3. This repository contains our raw datasets from channel measurements performed at the University of Utah campus. In addition, we have included a document that explains the setup and methodology used to collect this data, as well as a very brief discussion of results.

     File organization:
     * documentation/ - Contains a .docx with the description of the setup and evaluation.
     * data/ - HDF5 files containing both metadata and raw IQ samples for each location at which data was collected.

     Note that we collected data at 14 different client locations. See the map in the attached docx (locations 12 and 16 were skipped). We deployed 5 different receivers at 5 different rooftops. Due to resource constraints, one set of files contains data from 4 different locations whereas another set contains information from the single remaining location. We have developed a set of Python scripts that allow us to parse and analyze the data. Although not included here, they can be found in our public repository: https://github.com/renew-wireless/RENEWLab You can find the top script here.

     For more information on the POWDER-RENEW project please visit the POWDER website. The RENEW part of the project focuses on the deployment of an open-source massive MIMO system. Please visit our website for more information.
  4. EvoSL is the first large redistributable corpus of open-source Simulink models that contains project change histories. EvoSL contains 924 Git repositories from GitHub with their 3k issues, 2k pull requests, 10k comments, over 100k commits, and 2M element-level changes extracted from 14k Simulink model snapshots. 
  5. This dataset contains machine learning and volunteer classifications from the Gravity Spy project. It includes glitches from observing runs O1, O2, O3a and O3b that received at least one classification from a registered volunteer in the project. It also indicates glitches that are nominally retired from the project using our default set of retirement parameters, which are described below. See more details in the Gravity Spy Methods paper.

     When a particular subject in a citizen science project (in this case, glitches from the LIGO datastream) is deemed to be classified sufficiently, it is "retired" from the project. For the Gravity Spy project, retirement depends on a combination of both volunteer and machine learning classifications, and a number of parameterizations affect how quickly glitches get retired. For this dataset, we use a default set of retirement parameters, the most important of which are:

     (1) A glitch must be classified by at least 2 registered volunteers.
     (2) Based on both the initial machine learning classification and volunteer classifications, the glitch has more than a 90% probability of residing in a particular class.
     (3) Each volunteer classification (weighted by that volunteer's confusion matrix) contains a weight equal to the initial machine learning score when determining the final probability.

     The choice of these and other parameterizations will affect the accuracy of the retired dataset as well as the number of glitches that are retired, and will be explored in detail in an upcoming publication (Zevin et al. in prep).

     The dataset can be read in using e.g. Pandas:

```
import pandas as pd
dataset = pd.read_hdf('retired_fulldata_min2_max50_ret0p9.hdf5', key='image_db')
```

     Each row in the dataframe contains information about a particular glitch in the Gravity Spy dataset.

     Description of series in dataframe

     * ['1080Lines', '1400Ripples', 'Air_Compressor', 'Blip', 'Chirp', 'Extremely_Loud', 'Helix', 'Koi_Fish', 'Light_Modulation', 'Low_Frequency_Burst', 'Low_Frequency_Lines', 'No_Glitch', 'None_of_the_Above', 'Paired_Doves', 'Power_Line', 'Repeating_Blips', 'Scattered_Light', 'Scratchy', 'Tomte', 'Violin_Mode', 'Wandering_Line', 'Whistle'] - Machine learning scores for each glitch class in the trained model, which for a particular glitch will sum to unity.
     * ['ml_confidence', 'ml_label'] - Highest machine learning confidence score across all classes for a particular glitch, and the class associated with this score.
     * ['gravityspy_id', 'id'] - Unique identifier for each glitch on the Zooniverse platform ('gravityspy_id') and in the Gravity Spy project ('id'), which can be used to link a particular glitch to the full Gravity Spy dataset (which contains GPS times among many other descriptors).
     * ['retired'] - Marks whether the glitch is retired using our default set of retirement parameters (1=retired, 0=not retired).
     * ['Nclassifications'] - The total number of classifications performed by registered volunteers on this glitch.
     * ['final_score', 'final_label'] - The final score (weighted combination of machine learning and volunteer classifications) and the most probable type of glitch.
     * ['tracks'] - Array of classification weights that were added to each glitch category due to each volunteer's classification.

     For machine learning classifications on all glitches in O1, O2, O3a, and O3b, please see Gravity Spy Machine Learning Classifications on Zenodo.

     For the most recently uploaded training set used in Gravity Spy machine learning algorithms, please see Gravity Spy Training Set on Zenodo.

     For detailed information on the training set used for the original Gravity Spy machine learning paper, please see Machine learning for Gravity Spy: Glitch classification and dataset on Zenodo.
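
     As a rough illustration of how the series described above might be used, the sketch below builds a small synthetic dataframe (made-up values, not taken from the real file) with a few of the documented column names, then counts retired glitches per final label. The helper names `summarize_retired` and `meets_default_criteria` are hypothetical. The second helper checks only the two simplest criteria stated in the description (at least 2 volunteer classifications, final score above 0.9), so it is an approximation and not the project's actual retirement logic.

```python
import pandas as pd

# Synthetic stand-in rows (made-up values, NOT from the real dataset);
# only the column names follow the dataset description.
toy = pd.DataFrame(
    {
        "gravityspy_id": ["a1", "b2", "c3", "d4"],
        "ml_label": ["Blip", "Whistle", "Blip", "Koi_Fish"],
        "final_label": ["Blip", "Whistle", "Tomte", "Koi_Fish"],
        "final_score": [0.97, 0.95, 0.62, 0.99],
        "Nclassifications": [3, 2, 5, 1],
        "retired": [1, 1, 0, 0],
    }
)


def summarize_retired(df):
    """Count retired glitches (retired == 1) per final label."""
    retired = df[df["retired"] == 1]
    return retired["final_label"].value_counts().to_dict()


def meets_default_criteria(df, min_volunteers=2, threshold=0.9):
    """Approximate check of two stated criteria; the defaults are assumptions
    read off the description (>= 2 volunteers, > 90% probability)."""
    enough_votes = df["Nclassifications"] >= min_volunteers
    confident = df["final_score"] > threshold
    return enough_votes & confident


summary = summarize_retired(toy)
flags = meets_default_criteria(toy).tolist()
print(summary)
print(flags)
```

     With the real file, `toy` would be replaced by the dataframe returned from `pd.read_hdf` as shown in the description; the 'retired' column remains the authoritative flag.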