Title: Blending machine and human learning processes
Citizen science projects face a dilemma in relying on contributions from volunteers to achieve their scientific goals: providing volunteers with explicit training might increase the quality of contributions, but at the cost of losing the work done by newcomers during the training period, which for many is the only work they will contribute to the project. Based on research in cognitive science on how humans learn to classify images, we have designed an approach that uses machine learning to guide the presentation of tasks to newcomers, helping them learn the image classification task more quickly while still contributing to the work of the project. A Bayesian model for tracking volunteer learning is presented.
Award ID(s):
1547880
NSF-PAR ID:
10026452
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the Annual Hawaii International Conference on System Sciences
ISSN:
1530-1605
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Sserwanga, I. (Ed.)
    Citizen scientists make valuable contributions to science but need to learn about the data they are working with to be able to perform more advanced tasks. We present a set of design principles for identifying the kinds of background knowledge that are important to support learning at different stages of engagement, drawn from a study of how free/libre open source software developers are guided to create and use documents. Specifically, we suggest that newcomers require help understanding the purpose, form and content of the documents they engage with, while more advanced developers add understanding of information provenance and the boundaries, relevant participants and work processes. We apply those principles in two separate but related studies. In study 1, we analyze the background knowledge presented to volunteers in the Gravity Spy citizen-science project, mapping the resources to the framework and identifying kinds of knowledge that were not initially provided. In study 2, we use the principles proactively to develop design suggestions for Gravity Spy 2.0, which will involve volunteers in analyzing more diverse sources of data. This new project extends the application of the principles by seeking to use them to support understanding of the relationships between documents, not just the documents individually. We conclude by discussing future work, including a planned evaluation of Gravity Spy 2.0 that will provide a further test of the design principles. 
  2. Abstract

    The Gravity Spy project aims to uncover the origins of glitches, transient bursts of noise that hamper analysis of gravitational-wave data. By using both the work of citizen-science volunteers and machine learning algorithms, the Gravity Spy project enables reliable classification of glitches. Citizen science and machine learning are intrinsically coupled within the Gravity Spy framework, with machine learning classifications providing a rapid first-pass classification of the dataset and enabling tiered volunteer training, and volunteer-based classifications verifying the machine classifications, bolstering the machine learning training set and identifying new morphological classes of glitches. These classifications are now routinely used in studies characterizing the performance of the LIGO gravitational-wave detectors. Providing the volunteers with a training framework that teaches them to classify a wide range of glitches, as well as additional tools to aid their investigations of interesting glitches, empowers them to make discoveries of new classes of glitches. This demonstrates that, when given suitable support, volunteers can go beyond simple classification tasks to identify new features in data at a level comparable to domain experts. The Gravity Spy project is now providing volunteers with more complicated data that includes auxiliary monitors of the detector to identify the root cause of glitches.

     
  3. This dataset contains machine learning and volunteer classifications from the Gravity Spy project. It includes glitches from observing runs O1, O2, O3a and O3b that received at least one classification from a registered volunteer in the project. It also indicates glitches that are nominally retired from the project using our default set of retirement parameters, which are described below. See more details in the Gravity Spy Methods paper. 

    When a particular subject in a citizen science project (in this case, a glitch from the LIGO datastream) is deemed to be sufficiently classified, it is "retired" from the project. For the Gravity Spy project, retirement depends on a combination of both volunteer and machine learning classifications, and a number of parameterizations affect how quickly glitches get retired. For this dataset, we use a default set of retirement parameters, the most important of which are: 

    1. A glitch must be classified by at least 2 registered volunteers
    2. Based on both the initial machine learning classification and volunteer classifications, the glitch has more than a 90% probability of residing in a particular class
    3. Each volunteer classification (weighted by that volunteer's confusion matrix) carries a weight equal to the initial machine learning score when determining the final probability

    The choice of these and other parameterizations will affect the accuracy of the retired dataset as well as the number of glitches that are retired, and will be explored in detail in an upcoming publication (Zevin et al. in prep). 
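
The three default criteria above can be sketched as a small function. This is only an illustrative reading of the rules, not the project's actual implementation: it assumes that the machine learning scores and each volunteer's confusion-matrix-weighted vote are per-class weight vectors that each sum to one, that they are combined by simple addition, and that the result is normalized to give a final probability. The function name `combine_scores` and its parameters are hypothetical.

```python
import numpy as np

def combine_scores(ml_scores, volunteer_votes, min_volunteers=2, threshold=0.9):
    """Hypothetical sketch of the default retirement rule.

    ml_scores: per-class machine learning scores (sum to one).
    volunteer_votes: list of per-class weight vectors, one per volunteer,
        assumed to be already weighted by that volunteer's confusion matrix.
    """
    # Assumption: the ML score and each volunteer vote carry equal total weight
    total = np.asarray(ml_scores, dtype=float).copy()
    for vote in volunteer_votes:
        total += np.asarray(vote, dtype=float)

    # Normalize the accumulated weights into a probability per class
    posterior = total / total.sum()
    best = int(np.argmax(posterior))

    # Retire only with enough volunteers and a sufficiently confident class
    retired = bool(len(volunteer_votes) >= min_volunteers
                   and posterior[best] > threshold)
    return posterior, best, retired
```

With a machine learning score of 0.8 for one class and two volunteers who both vote for that class with full weight, the combined probability is 2.8/3 (about 0.93) and the glitch would be retired under these assumptions; with a single volunteer it would not.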

    The dataset can be read in using e.g. Pandas: 
    ```
    import pandas as pd

    # Reading HDF5 files with pandas requires the optional PyTables ("tables") package
    dataset = pd.read_hdf('retired_fulldata_min2_max50_ret0p9.hdf5', key='image_db')
    ```
    Each row in the dataframe contains information about a particular glitch in the Gravity Spy dataset. 

    Description of series in dataframe

    • ['1080Lines', '1400Ripples', 'Air_Compressor', 'Blip', 'Chirp', 'Extremely_Loud', 'Helix', 'Koi_Fish', 'Light_Modulation', 'Low_Frequency_Burst', 'Low_Frequency_Lines', 'No_Glitch', 'None_of_the_Above', 'Paired_Doves', 'Power_Line', 'Repeating_Blips', 'Scattered_Light', 'Scratchy', 'Tomte', 'Violin_Mode', 'Wandering_Line', 'Whistle']
      • Machine learning scores for each glitch class in the trained model, which for a particular glitch will sum to unity
    • ['ml_confidence', 'ml_label']
      • Highest machine learning confidence score across all classes for a particular glitch, and the class associated with this score
    • ['gravityspy_id', 'id']
      • Unique identifier for each glitch on the Zooniverse platform ('gravityspy_id') and in the Gravity Spy project ('id'), which can be used to link a particular glitch to the full Gravity Spy dataset (which contains GPS times among many other descriptors)
    • ['retired']
      • Marks whether the glitch is retired using our default set of retirement parameters (1=retired, 0=not retired)
    • ['Nclassifications']
      • The total number of classifications performed by registered volunteers on this glitch
    • ['final_score', 'final_label']
      • The final score (weighted combination of machine learning and volunteer classifications) and the most probable type of glitch
    • ['tracks']
      • Array of classification weights that were added to each glitch category due to each volunteer's classification
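
As a rough illustration of how the series described above might be used, the sketch below builds a small toy dataframe and runs a few typical queries. The column names come from the description above, but every value is invented, and only three of the 22 class columns are included, so the sum-to-unity check here applies to the toy data only.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real dataframe: values are invented for illustration
class_cols = ['Blip', 'Koi_Fish', 'Whistle']
toy = pd.DataFrame({
    'gravityspy_id': ['a1', 'b2', 'c3'],
    'Blip': [0.90, 0.20, 0.05],
    'Koi_Fish': [0.05, 0.70, 0.05],
    'Whistle': [0.05, 0.10, 0.90],
    'retired': [1, 0, 1],
    'Nclassifications': [3, 1, 5],
    'final_score': [0.97, 0.60, 0.95],
    'final_label': ['Blip', 'Koi_Fish', 'Whistle'],
})

# Machine learning scores across the class columns should sum to unity
scores_ok = np.allclose(toy[class_cols].sum(axis=1), 1.0)

# Recompute the top ML class and its score from the per-class columns
toy['ml_label'] = toy[class_cols].idxmax(axis=1)
toy['ml_confidence'] = toy[class_cols].max(axis=1)

# Keep only retired glitches and count them by final label
retired = toy[toy['retired'] == 1]
counts = retired['final_label'].value_counts()
```

On the real dataset the same selections would be applied to `dataset` with the full list of class columns.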

     

    For machine learning classifications on all glitches in O1, O2, O3a, and O3b, please see Gravity Spy Machine Learning Classifications on Zenodo.

    For the most recently uploaded training set used in Gravity Spy machine learning algorithms, please see Gravity Spy Training Set on Zenodo.

    For detailed information on the training set used for the original Gravity Spy machine learning paper, please see Machine learning for Gravity Spy: Glitch classification and dataset on Zenodo. 

     
  4. Gravity Spy is a citizen science project that draws on the contributions of both humans and machines to achieve its scientific goals. The system supports the Laser Interferometer Gravitational-Wave Observatory (LIGO) by classifying “glitches” that interfere with observations. The system makes three advances on the current state of the art: explicit training for new volunteers, synergy between machine and human classification and support for discovery of new classes of glitch. As well, it provides a platform for human-centred computing research on motivation, learning and collaboration. The system has been launched and is currently in operation. 
  5. This project will contribute to the national need for well-educated scientists, mathematicians, engineers, and technicians by supporting the retention and graduation of high-achieving, low-income students with demonstrated financial need at Minnesota State University, Mankato. Over its six-year duration, this project will fund scholarships to 120 unique full-time students who are pursuing Bachelor of Science degrees in engineering. First semester junior, primarily transfer, students at Iron Range Engineering will receive scholarships for one semester. The Iron Range Engineering (IRE) STEM Scholars Program provides a financially sustainable pathway for students across the nation to graduate with an engineering degree and up to two years of industry experience. Students typically complete their first two years of engineering coursework at community colleges across the country. Students then join IRE and spend one transitional semester gaining training and experience to equip them with the technical, design, and professional skills needed to succeed in the engineering workforce. During the last two years of their education, IRE students work in industry, earning an engineering intern salary, while being supported in their technical and professional development by professors, learning facilitators, and their own peers. The IRE STEM Scholars project will provide access to a financially responsible engineering degree for low-income students by financially supporting them during the transitional semester, which has two financial challenges: university tuition costs are higher than their previous community college costs, and the semester occurs before they are able to earn an engineering co-op income. In addition, the project will provide personalized mentorship throughout students’ pathway to graduation, such as weekly conversations with a mentor. 
By providing these supports, the IRE STEM Scholars project aims to prepare students to be competitive applicants for the engineering workforce with career development and engineering co-op experience. Because community colleges draw relatively representative proportions of students from a variety of backgrounds, this project has the potential to learn how transfer pathways and co-op education can support financially sustainable pathways to engineering degrees for a more diverse group of students and contribute to the development of a diverse, competitive engineering workforce. The overall goal of this project is to increase STEM degree completion of low-income, high-achieving undergraduates with demonstrated financial need. As part of the scope of this project, a concurrent mixed-methods research study will be done on engineering students’ thriving, specifically their identity, belonging, motivation, and overall wellbeing (or mental and physical health). Student outcomes have previously been measured primarily through academic markers such as graduation rates and GPA. In addition to these outcomes, this project explores ways to better support overall student thriving. This study will address the following research questions: How do undergraduate students’ engineering identity and belongingness develop over time in a co-op-based engineering program? How do undergraduate students’ motivation and identity connect to overall wellbeing in a co-op-based engineering program? In the first year of the IRE STEM Scholars Project, initial interview data describe scholars’ sense of belonging in engineering, prior to their first co-op experiences and survey data describe IRE students’ experiences in co-op and overall sense of belonging. Future work will utilize these values to identify ways to better support the IRE STEM scholars’ identity development as they move into their first co-op experiences. 
This project is funded by NSF’s Scholarships in Science, Technology, Engineering, and Mathematics program, which seeks to increase the number of low-income academically talented students with demonstrated financial need who earn degrees in STEM fields. It also aims to improve the education of future STEM workers, and to generate knowledge about academic success, retention, transfer, graduation, and academic/career pathways of low-income students. 