skip to main content


Search for: All records

Award ID contains: 2106882

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. This dataset contains machine learning and volunteer classifications from the Gravity Spy project. It includes glitches from observing runs O1, O2, O3a and O3b that received at least one classification from a registered volunteer in the project. It also indicates glitches that are nominally retired from the project using our default set of retirement parameters, which are described below. See more details in the Gravity Spy Methods paper. 

    When a particular subject in a citizen science project (in this case, glitches from the LIGO datastream) is deemed to be classified sufficiently it is "retired" from the project. For the Gravity Spy project, retirement depends on a combination of both volunteer and machine learning classifications, and a number of parameterizations affect how quickly glitches get retired. For this dataset, we use a default set of retirement parameters, the most important of which are: 

    1. A glitches must be classified by at least 2 registered volunteers
    2. Based on both the initial machine learning classification and volunteer classifications, the glitch has more than a 90% probability of residing in a particular class
    3. Each volunteer classification (weighted by that volunteer's confusion matrix) contains a weight equal to the initial machine learning score when determining the final probability

    The choice of these and other parameterization will affect the accuracy of the retired dataset as well as the number of glitches that are retired, and will be explored in detail in an upcoming publication (Zevin et al. in prep). 

    The dataset can be read in using e.g. Pandas: 
    ```
    import pandas as pd
    dataset = pd.read_hdf('retired_fulldata_min2_max50_ret0p9.hdf5', key='image_db')
    ```
    Each row in the dataframe contains information about a particular glitch in the Gravity Spy dataset. 

    Description of series in dataframe

    • ['1080Lines', '1400Ripples', 'Air_Compressor', 'Blip', 'Chirp', 'Extremely_Loud', 'Helix', 'Koi_Fish', 'Light_Modulation', 'Low_Frequency_Burst', 'Low_Frequency_Lines', 'No_Glitch', 'None_of_the_Above', 'Paired_Doves', 'Power_Line', 'Repeating_Blips', 'Scattered_Light', 'Scratchy', 'Tomte', 'Violin_Mode', 'Wandering_Line', 'Whistle']
      • Machine learning scores for each glitch class in the trained model, which for a particular glitch will sum to unity
    • ['ml_confidence', 'ml_label']
      • Highest machine learning confidence score across all classes for a particular glitch, and the class associated with this score
    • ['gravityspy_id', 'id']
      • Unique identified for each glitch on the Zooniverse platform ('gravityspy_id') and in the Gravity Spy project ('id'), which can be used to link a particular glitch to the full Gravity Spy dataset (which contains GPS times among many other descriptors)
    • ['retired']
      • Marks whether the glitch is retired using our default set of retirement parameters (1=retired, 0=not retired)
    • ['Nclassifications']
      • The total number of classifications performed by registered volunteers on this glitch
    • ['final_score', 'final_label']
      • The final score (weighted combination of machine learning and volunteer classifications) and the most probable type of glitch
    • ['tracks']
      • Array of classification weights that were added to each glitch category due to each volunteer's classification

     

    ```
    For machine learning classifications on all glitches in O1, O2, O3a, and O3b, please see Gravity Spy Machine Learning Classifications on Zenodo

    For the most recently uploaded training set used in Gravity Spy machine learning algorithms, please see Gravity Spy Training Set on Zenodo.

    For detailed information on the training set used for the original Gravity Spy machine learning paper, please see Machine learning for Gravity Spy: Glitch classification and dataset on Zenodo. 

     
    more » « less
  2. This data set contains all classifications that the Gravity Spy Machine Learning model for LIGO glitches from the first three observing runs (O1, O2 and O3, where O3 is split into O3a and O3b). Gravity Spy classified all noise events identified by the Omicron trigger pipeline in which Omicron identified that the signal-to-noise ratio was above 7.5 and the peak frequency of the noise event was between 10 Hz and 2048 Hz. To classify noise events, Gravity Spy made Omega scans of every glitch consisting of 4 different durations, which helps capture the morphology of noise events that are both short and long in duration.

    There are 22 classes used for O1 and O2 data (including No_Glitch and None_of_the_Above), while there are two additional classes used to classify O3 data.

    For O1 and O2, the glitch classes were: 1080Lines, 1400Ripples, Air_Compressor, Blip, Chirp, Extremely_Loud, Helix, Koi_Fish, Light_Modulation, Low_Frequency_Burst, Low_Frequency_Lines, No_Glitch, None_of_the_Above, Paired_Doves, Power_Line, Repeating_Blips, Scattered_Light, Scratchy, Tomte, Violin_Mode, Wandering_Line, Whistle

    For O3, the glitch classes were: 1080Lines, 1400Ripples, Air_Compressor, Blip, Blip_Low_Frequency, Chirp, Extremely_Loud, Fast_Scattering, Helix, Koi_Fish, Light_Modulation, Low_Frequency_Burst, Low_Frequency_Lines, No_Glitch, None_of_the_Above, Paired_Doves, Power_Line, Repeating_Blips, Scattered_Light, Scratchy, Tomte, Violin_Mode, Wandering_Line, Whistle

    If you would like to download the Omega scans associated with each glitch, then you can use the gravitational-wave data-analysis tool GWpy. If you would like to use this tool, please install anaconda if you have not already and create a virtual environment using the following command

    ```conda create --name gravityspy-py38 -c conda-forge python=3.8 gwpy pandas psycopg2 sqlalchemy```

    After downloading one of the CSV files for a specific era and interferometer, please run the following Python script if you would like to download the data associated with the metadata in the CSV file. We recommend not trying to download too many images at one time. For example, the script below will read data on Hanford glitches from O2 that were classified by Gravity Spy and filter for only glitches that were labelled as Blips with 90% confidence or higher, and then download the first 4 rows of the filtered table.

    ```

    from gwpy.table import GravitySpyTable

    H1_O2 = GravitySpyTable.read('H1_O2.csv')

    H1_O2[(H1_O2["ml_label"] == "Blip") & (H1_O2["ml_confidence"] > 0.9)]

    H1_O2[0:4].download(nproc=1)

    ```

    Each of the columns in the CSV files are taken from various different inputs: 

    [‘event_time’, ‘ifo’, ‘peak_time’, ‘peak_time_ns’, ‘start_time’, ‘start_time_ns’, ‘duration’, ‘peak_frequency’, ‘central_freq’, ‘bandwidth’, ‘channel’, ‘amplitude’, ‘snr’, ‘q_value’] contain metadata about the signal from the Omicron pipeline. 

    [‘gravityspy_id’] is the unique identifier for each glitch in the dataset. 

    [‘1400Ripples’, ‘1080Lines’, ‘Air_Compressor’, ‘Blip’, ‘Chirp’, ‘Extremely_Loud’, ‘Helix’, ‘Koi_Fish’, ‘Light_Modulation’, ‘Low_Frequency_Burst’, ‘Low_Frequency_Lines’, ‘No_Glitch’, ‘None_of_the_Above’, ‘Paired_Doves’, ‘Power_Line’, ‘Repeating_Blips’, ‘Scattered_Light’, ‘Scratchy’, ‘Tomte’, ‘Violin_Mode’, ‘Wandering_Line’, ‘Whistle’] contain the machine learning confidence for a glitch being in a particular Gravity Spy class (the confidence in all these columns should sum to unity). 

    [‘ml_label’, ‘ml_confidence’] provide the machine-learning predicted label for each glitch, and the machine learning confidence in its classification. 

    [‘url1’, ‘url2’, ‘url3’, ‘url4’] are the links to the publicly-available Omega scans for each glitch. ‘url1’ shows the glitch for a duration of 0.5 seconds, ‘url2’ for 1 seconds, ‘url3’ for 2 seconds, and ‘url4’ for 4 seconds.

    ```

    For the most recently uploaded training set used in Gravity Spy machine learning algorithms, please see Gravity Spy Training Set on Zenodo.

    For detailed information on the training set used for the original Gravity Spy machine learning paper, please see Machine learning for Gravity Spy: Glitch classification and dataset on Zenodo. 

     
    more » « less