We explore the bi-directional relationship between human and machine learning in citizen science. Theoretically, the study draws on the zone of proximal development (ZPD) concept, which allows us to describe AI augmentation of human learning, human augmentation of machine learning, and how tasks can be designed to facilitate co-learning. The study takes a design-science approach to explore the design, deployment, and evaluation of the Gravity Spy citizen science project. The findings highlight the challenges and opportunities of co-learning, where both humans and machines contribute to each other’s learning and capabilities. The study takes its point of departure in the literature on co-learning and develops a framework for designing projects where humans and machines mutually enhance each other’s learning. The research contributes to the existing literature by developing a dynamic approach to human-AI augmentation and by emphasizing that the ZPD supports ongoing learning for volunteers and keeps machine learning aligned with evolving data. The approach offers potential benefits for project scalability, participant engagement, and automation considerations while acknowledging the importance of tutorials, community access, and expert involvement in supporting learning.
Citizen science frontiers: Efficiency, engagement, and serendipitous discovery with human–machine systems
Citizen science has proved to be a unique and effective tool in helping science and society cope with the ever-growing data rates and volumes that characterize the modern research landscape. It also serves a critical role in engaging the public with research in a direct, authentic fashion and by doing so promotes a better understanding of the processes of science. To take full advantage of the onslaught of data being experienced across the disciplines, it is essential that citizen science platforms leverage the complementary strengths of humans and machines. This Perspectives piece explores the issues encountered in designing human–machine systems optimized for both efficiency and volunteer engagement, while striving to safeguard and encourage opportunities for serendipitous discovery. We discuss case studies from Zooniverse, a large online citizen science platform, and show that combining human and machine classifications can efficiently produce results superior to those of either one alone and how smart task allocation can lead to further efficiencies in the system. While these examples make clear the promise of human–machine integration within an online citizen science system, we then explore in detail how system design choices can inadvertently lower volunteer engagement, create exclusionary practices, and reduce opportunity for serendipitous discovery. Throughout we investigate the tensions that arise when designing a human–machine system serving the dual goals of carrying out research in the most efficient manner possible while empowering a broad community to authentically engage in this research.
- PAR ID: 10084954
- Publisher / Repository: Proceedings of the National Academy of Sciences
- Date Published:
- Journal Name: Proceedings of the National Academy of Sciences
- Volume: 116
- Issue: 6
- ISSN: 0027-8424
- Page Range / eLocation ID: p. 1902-1909
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Citizen science and artificial intelligence (AI) complement each other by harnessing the strengths of both human and machine capabilities. Citizen science generates terabytes of raw numerical, text, and image data, the analysis of which requires automated techniques to process in an efficient manner. Conversely, AI computer vision technology can require tens of thousands of images during the training process, and citizen science projects are well suited to provide large libraries of data. Herein, we describe how AI tools are being applied across the GLOBE Observer citizen science data ecosystem, where image recognition algorithms are supporting data ingest processes, protecting user privacy and improving data fidelity. GLOBE citizen science data has been used to develop automated data classification routines that enable information discovery of mosquito larvae and land cover labels. These advances position GLOBE citizen scientist data for discovery and use in environmental and health research, as well as by machine learning scientists working in the general field of GeoAI.
-
The bulk of research on citizen science participants is project centric, based on an assumption that volunteers experience a single project. Contrary to this assumption, survey responses (n = 3894) and digital trace data (n = 3649) from volunteers, who collectively engaged in 1126 unique projects, revealed that multiproject participation was the norm. Only 23% of volunteers were singletons (who participated in only one project). The remaining multiproject participants were split evenly between discipline specialists (39%) and discipline spanners (38% joined projects with different disciplinary topics) and unevenly between mode specialists (52%) and mode spanners (25% participated in online and offline projects). Public engagement was narrow: The multiproject participants were eight times more likely to be White and five times more likely to hold advanced degrees than the general population. We propose a volunteer-centric framework that explores how the dynamic accumulation of experiences in a project ecosystem can support broad learning objectives and inclusive citizen science.
-
We explore patterns of interaction with different learning resources (e.g., forums) to predict learning outcomes in an online citizen science project called Gravity Spy. To explore how volunteers engage with and benefit from these resources, we categorize them based on Sørensen's three forms of presence in learning environments: authority-subject, agent-centered, and communal presence. Methodologically, we apply sequence analysis to traces of volunteer interactions with the project to identify engagement patterns with these resources that predict learning. Our interpretation of these patterns is augmented by insights gleaned from interviews with volunteers about their work and use of learning resources. We find that early in the project, volunteers have only a simple task to learn, and completing that task is most predictive of their learning. At more advanced levels, when tasks become more complex, discussions with other volunteers become increasingly important, and interaction patterns become more varied. Viewing learning as a series of routines allows us to articulate precisely how and in what context learning occurs. We conclude by discussing the implications of these findings for designing citizen science projects that promote learning.
-
Fortson, Lucy; Crowston, Kevin; Kloetzer, Laure; Ponti, Marisa (Ed.) Citizen science has become a valuable and reliable method for interpreting and processing big datasets, and is vital in the era of ever-growing data volumes. However, generating labels from citizen scientists is inherently difficult, because variability between the members of the crowd leads to variability in the results. Sometimes this variability is useful, as with serendipitous discoveries, which correspond to rare or unknown classes in the data, but it might also be due to ambiguity between classes. The primary issue is then to distinguish between the intrinsic variability in the dataset and the uncertainty in the citizen scientists’ responses, and to leverage that distinction to extract scientifically useful relationships. In this paper, we explore using a neural network to interpret volunteer confusion across the dataset, to increase the purity of the downstream analysis. We focus on the use of learned features from the network to disentangle feature similarity across the classes, and on the ability of the machine’s “attention” to identify features that lead to confusion. We use data from Jovian Vortex Hunter, a citizen science project to study vortices in Jupiter’s atmosphere, and find that the latent space from the model helps effectively identify different sources of image-level features that lead to low volunteer consensus. Furthermore, the machine’s attention highlights features corresponding to specific classes. This provides meaningful image-level feature-class relationships, which are useful in our analysis for identifying vortex-specific features to better understand vortex evolution mechanisms. Finally, we discuss the applicability of this method to other citizen science projects.
