NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Design Principles for Background Knowledge to Enhance Learning in Citizen Science

https://doi.org/10.1007/978-3-031-28032-0_43

Crowston, K.; Jackson, C.; Corieri, I.; Østerlund, C. (January 2023, Information for a Better World: Normality, Virtuality, Physicality, Inclusivity. iConference 2023. Lecture Notes in Computer Science, vol 13972)
Sserwanga, I. (Ed.)
Citizen scientists make valuable contributions to science but need to learn about the data they are working with to be able to perform more advanced tasks. We present a set of design principles for identifying the kinds of background knowledge that are important to support learning at different stages of engagement, drawn from a study of how free/libre open source software developers are guided to create and use documents. Specifically, we suggest that newcomers require help understanding the purpose, form and content of the documents they engage with, while more advanced developers add understanding of information provenance and the boundaries, relevant participants and work processes. We apply those principles in two separate but related studies. In study 1, we analyze the background knowledge presented to volunteers in the Gravity Spy citizen-science project, mapping the resources to the framework and identifying kinds of knowledge that were not initially provided. In study 2, we use the principles proactively to develop design suggestions for Gravity Spy 2.0, which will involve volunteers in analyzing more diverse sources of data. This new project extends the application of the principles by seeking to use them to support understanding of the relationships between documents, not just the documents individually. We conclude by discussing future work, including a planned evaluation of Gravity Spy 2.0 that will provide a further test of the design principles.
more » « less
Full Text Available
Gravity Spy Volunteer Classifications of LIGO Glitches from Observing Runs O1, O2, O3a, and O3b

https://doi.org/10.5281/zenodo.5911226

Zevin, Michael; Coughlin, Scott; Chase, Eve; Allen, Sara; Bahaadini, Sara; Berry, Christopher; Crowston, Kevin; Harandi, Mabi; Jackson, Corey; Kalogera, Vicky; et al (January 2022, Zenodo)

{"Abstract":["This dataset contains machine learning and volunteer classifications from the Gravity Spy project. It includes glitches from observing runs O1, O2, O3a and O3b that received at least one classification from a registered volunteer in the project. It also indicates glitches that are nominally retired from the project using our default set of retirement parameters, which are described below. See more details in the Gravity Spy Methods paper. <\/p>\n\nWhen a particular subject in a citizen science project (in this case, glitches from the LIGO datastream) is deemed to be classified sufficiently it is "retired" from the project. For the Gravity Spy project, retirement depends on a combination of both volunteer and machine learning classifications, and a number of parameterizations affect how quickly glitches get retired. For this dataset, we use a default set of retirement parameters, the most important of which are: <\/p>\n\nA glitches must be classified by at least 2 registered volunteers<\/li>Based on both the initial machine learning classification and volunteer classifications, the glitch has more than a 90% probability of residing in a particular class<\/li>Each volunteer classification (weighted by that volunteer's confusion matrix) contains a weight equal to the initial machine learning score when determining the final probability<\/li><\/ol>\n\nThe choice of these and other parameterization will affect the accuracy of the retired dataset as well as the number of glitches that are retired, and will be explored in detail in an upcoming publication (Zevin et al. in prep). <\/p>\n\nThe dataset can be read in using e.g. Pandas: \n```\nimport pandas as pd\ndataset = pd.read_hdf('retired_fulldata_min2_max50_ret0p9.hdf5', key='image_db')\n```\nEach row in the dataframe contains information about a particular glitch in the Gravity Spy dataset. <\/p>\n\nDescription of series in dataframe<\/strong><\/p>\n\n['1080Lines', '1400Ripples', 'Air_Compressor', 'Blip', 'Chirp', 'Extremely_Loud', 'Helix', 'Koi_Fish', 'Light_Modulation', 'Low_Frequency_Burst', 'Low_Frequency_Lines', 'No_Glitch', 'None_of_the_Above', 'Paired_Doves', 'Power_Line', 'Repeating_Blips', 'Scattered_Light', 'Scratchy', 'Tomte', 'Violin_Mode', 'Wandering_Line', 'Whistle']\n\tMachine learning scores for each glitch class in the trained model, which for a particular glitch will sum to unity<\/li><\/ul>\n\t<\/li>['ml_confidence', 'ml_label']\n\tHighest machine learning confidence score across all classes for a particular glitch, and the class associated with this score<\/li><\/ul>\n\t<\/li>['gravityspy_id', 'id']\n\tUnique identified for each glitch on the Zooniverse platform ('gravityspy_id') and in the Gravity Spy project ('id'), which can be used to link a particular glitch to the full Gravity Spy dataset (which contains GPS times among many other descriptors)<\/li><\/ul>\n\t<\/li>['retired']\n\tMarks whether the glitch is retired using our default set of retirement parameters (1=retired, 0=not retired)<\/li><\/ul>\n\t<\/li>['Nclassifications']\n\tThe total number of classifications performed by registered volunteers on this glitch<\/li><\/ul>\n\t<\/li>['final_score', 'final_label']\n\tThe final score (weighted combination of machine learning and volunteer classifications) and the most probable type of glitch<\/li><\/ul>\n\t<\/li>['tracks']\n\tArray of classification weights that were added to each glitch category due to each volunteer's classification<\/li><\/ul>\n\t<\/li><\/ul>\n\n <\/p>\n\n```\nFor machine learning classifications on all glitches in O1, O2, O3a, and O3b, please see Gravity Spy Machine Learning Classifications on Zenodo<\/p>\n\nFor the most recently uploaded training set used in Gravity Spy machine learning algorithms, please see Gravity Spy Training Set on Zenodo.<\/p>\n\nFor detailed information on the training set used for the original Gravity Spy machine learning paper, please see Machine learning for Gravity Spy: Glitch classification and dataset on Zenodo. <\/p>"]}
more » « less
Discovering features in gravitational-wave data through detector characterization, citizen science and machine learning

https://doi.org/10.1088/1361-6382/ac1ccb

Soni, S; Berry, C P; Coughlin, S B; Harandi, M; Jackson, C B; Crowston, K; Østerlund, C; Patane, O; Katsaggelos, A K; Trouille, L; et al (September 2021, Classical and Quantum Gravity)
null (Ed.)
Full Text Available
Gravity Spy Machine Learning Classifications of LIGO Glitches from Observing Runs O1, O2, O3a, and O3b

https://doi.org/10.5281/zenodo.5649211

Coughlin, Scott; Zevin, Michael; Bahaadini, Sara; Rohani, Neda; Allen, Sara; Berry, Christopher; Crowston, Kevin; Harandi, Mabi; Jackson, Corey; Kalogera, Vicky; et al (January 2021, Zenodo)

{"Abstract":["This data set contains all classifications that the Gravity Spy Machine Learning model for LIGO glitches from the first three observing runs (O1, O2 and O3, where O3 is split into O3a and O3b). Gravity Spy classified all noise events identified by the Omicron trigger pipeline in which Omicron identified that the signal-to-noise ratio was above 7.5 and the peak frequency of the noise event was between 10 Hz and 2048 Hz. To classify noise events, Gravity Spy made Omega scans of every glitch consisting of 4 different durations, which helps capture the morphology of noise events that are both short and long in duration.<\/p>\n\nThere are 22 classes used for O1 and O2 data (including No_Glitch and None_of_the_Above), while there are two additional classes used to classify O3 data.<\/p>\n\nFor O1 and O2, the glitch classes were: 1080Lines, 1400Ripples, Air_Compressor, Blip, Chirp, Extremely_Loud, Helix, Koi_Fish, Light_Modulation, Low_Frequency_Burst, Low_Frequency_Lines, No_Glitch, None_of_the_Above, Paired_Doves, Power_Line, Repeating_Blips, Scattered_Light, Scratchy, Tomte, Violin_Mode, Wandering_Line, Whistle<\/p>\n\nFor O3, the glitch classes were: 1080Lines, 1400Ripples, Air_Compressor, Blip, Blip_Low_Frequency<\/strong>, Chirp, Extremely_Loud, Fast_Scattering<\/strong>, Helix, Koi_Fish, Light_Modulation, Low_Frequency_Burst, Low_Frequency_Lines, No_Glitch, None_of_the_Above, Paired_Doves, Power_Line, Repeating_Blips, Scattered_Light, Scratchy, Tomte, Violin_Mode, Wandering_Line, Whistle<\/p>\n\nIf you would like to download the Omega scans associated with each glitch, then you can use the gravitational-wave data-analysis tool GWpy. If you would like to use this tool, please install anaconda if you have not already and create a virtual environment using the following command<\/p>\n\n```conda create --name gravityspy-py38 -c conda-forge python=3.8 gwpy pandas psycopg2 sqlalchemy```<\/p>\n\nAfter downloading one of the CSV files for a specific era and interferometer, please run the following Python script if you would like to download the data associated with the metadata in the CSV file. We recommend not trying to download too many images at one time. For example, the script below will read data on Hanford glitches from O2 that were classified by Gravity Spy and filter for only glitches that were labelled as Blips with 90% confidence or higher, and then download the first 4 rows of the filtered table.<\/p>\n\n```<\/p>\n\nfrom gwpy.table import GravitySpyTable<\/p>\n\nH1_O2 = GravitySpyTable.read('H1_O2.csv')<\/p>\n\nH1_O2[(H1_O2["ml_label"] == "Blip") & (H1_O2["ml_confidence"] > 0.9)]<\/p>\n\nH1_O2[0:4].download(nproc=1)<\/p>\n\n```<\/p>\n\nEach of the columns in the CSV files are taken from various different inputs: <\/p>\n\n[\u2018event_time\u2019, \u2018ifo\u2019, \u2018peak_time\u2019, \u2018peak_time_ns\u2019, \u2018start_time\u2019, \u2018start_time_ns\u2019, \u2018duration\u2019, \u2018peak_frequency\u2019, \u2018central_freq\u2019, \u2018bandwidth\u2019, \u2018channel\u2019, \u2018amplitude\u2019, \u2018snr\u2019, \u2018q_value\u2019] contain metadata about the signal from the Omicron pipeline. <\/p>\n\n[\u2018gravityspy_id\u2019] is the unique identifier for each glitch in the dataset. <\/p>\n\n[\u20181400Ripples\u2019, \u20181080Lines\u2019, \u2018Air_Compressor\u2019, \u2018Blip\u2019, \u2018Chirp\u2019, \u2018Extremely_Loud\u2019, \u2018Helix\u2019, \u2018Koi_Fish\u2019, \u2018Light_Modulation\u2019, \u2018Low_Frequency_Burst\u2019, \u2018Low_Frequency_Lines\u2019, \u2018No_Glitch\u2019, \u2018None_of_the_Above\u2019, \u2018Paired_Doves\u2019, \u2018Power_Line\u2019, \u2018Repeating_Blips\u2019, \u2018Scattered_Light\u2019, \u2018Scratchy\u2019, \u2018Tomte\u2019, \u2018Violin_Mode\u2019, \u2018Wandering_Line\u2019, \u2018Whistle\u2019] contain the machine learning confidence for a glitch being in a particular Gravity Spy class (the confidence in all these columns should sum to unity). <\/p>\n\n[\u2018ml_label\u2019, \u2018ml_confidence\u2019] provide the machine-learning predicted label for each glitch, and the machine learning confidence in its classification. <\/p>\n\n[\u2018url1\u2019, \u2018url2\u2019, \u2018url3\u2019, \u2018url4\u2019] are the links to the publicly-available Omega scans for each glitch. \u2018url1\u2019 shows the glitch for a duration of 0.5 seconds, \u2018url2\u2019 for 1 seconds, \u2018url3\u2019 for 2 seconds, and \u2018url4\u2019 for 4 seconds.<\/p>\n\n```<\/p>\n\nFor the most recently uploaded training set used in Gravity Spy machine learning algorithms, please see Gravity Spy Training Set on Zenodo.<\/p>\n\nFor detailed information on the training set used for the original Gravity Spy machine learning paper, please see Machine learning for Gravity Spy: Glitch classification and dataset on Zenodo. <\/p>"]}
more » « less
Building an Apparatus: Refractive, Reflective, and Diffractive Readings of Trace Data

https://doi.org/10.17705/1jais.00590

Østerlund, Carsten; Crowston, Kevin; Jackson, Corey (January 2020, Journal of the Association for Information Systems)
null (Ed.)
Full Text Available
Blending machine and human learning processes

Crowston, K.; Østerlund, C.; Lee, T. Kyoung (January 2017, Proceedings of the Annual Hawaii International Conference on System Sciences)

Citizen science projects face a dilemma in relying on contributions from volunteers to achieve their scientific goals: providing volunteers with explicit training might increase the quality of contributions, but at the cost of losing the work done by newcomers during the training period, which for many is the only work they will contribute to the project. Based on research in cognitive science on how humans learn to classify images, we have designed an approach to use machine learning to guide the presentation of tasks to newcomers that help them more quickly learn how to do the image classification task while still contributing to the work of the project. A Bayesian model for tracking volunteer learning is presented.
more » « less
Full Text Available
Recruiting Messages Matter: Message Strategies to Attract Citizen Scientists

https://doi.org/10.1145/3022198.3026335

Lee, Tae Kyoung; Crowston, Kevin; Østerlund, Carsten; Miller, Grant (January 2017, Companion of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing)

Although participation of citizen scientists is critical for a success of citizen science projects (a distinctive form of crowdsourcing), little attention has been paid to what types of messages can effectively recruit citizen scientists. Derived from previous studies on citizen scientists’ motivations, we created and sent participants one of four recruiting messages for a new project, Gravity Spy, appealing to different motivations (i.e., learning about science, social proof, contribution to science, and altruism). Counter to earlier studies on motivation, our results showed that messages appealing to learning, contribution and social proof were more effective than a message appealing to altruism. We discuss the inconsistency between the present and prior study results and plans for future work.
more » « less
Full Text Available
Gravity Spy: Humans, Machines and The Future of Citizen Science

https://doi.org/10.1145/3022198.3026329

Crowston, Kevin (January 2017, Companion of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing)

Gravity Spy is a citizen science project that draws on the contributions of both humans and machines to achieve its scientific goals. The system supports the Laser Interferometer Gravitational Observatory (LIGO) by classifying “glitches” that interfere with observations. The system makes three advances on the current state of the art: explicit training for new volunteers, synergy between machine and human classification and support for discovery of new classes of glitch. As well, it provides a platform for human-centred computing research on motivation, learning and collaboration. The system has been launched and is currently in operation.
more » « less
Full Text Available
"Guess what! You're the First to See this Event": Increasing Contribution to Online Production Communities

https://doi.org/10.1145/2957276.2957284

Jackson, Corey Brian; Crowston, Kevin; Mugar, Gabriel; Østerlund, Carsten (January 2016, GROUP '16: Proceedings of the 19th International Conference on Supporting Group Work)

In this paper, we describe the results of an online field experiment examining the impacts of messaging about task novelty on the volume of volunteers’ contributions to an online citizen science project. Encouraging volunteers to provide a little more content as they work is an attractive strategy to increase the community’s output. Prior research found that an important motivation for participation in online citizen science is the wonder of being the first person to observe a particular image. To appeal to this motivation, a pop-up message was added to an online citizen science project that alerted volunteers when they were the first to annotate a particular image. Our analysis reveals that new volunteers who saw these messages increased the volume of annotations they contributed. The results of our study suggest an additional strategy to increase the amount of work volunteers contribute to online communities and citizen science projects specifically.
more » « less
Full Text Available

Search for: All records