Abstract Wetland loss is increasing rapidly, and there are gaps in public awareness of the problem. By crowdsourcing image analysis of wetland morphology, academic and government studies could be supplemented and accelerated while engaging and educating the public. The Land Loss Lookout (LLL) project crowdsourced mapping of wetland morphology associated with wetland loss and restoration. We demonstrate that volunteers can be trained relatively easily online to identify characteristic wetland morphologies, or patterns present on the landscape that suggest a specific geomorphological process. Results from a case study in coastal Louisiana revealed strong agreement among nonexpert and expert assessments who agreed on classifications at least 83% and at most 94% of the time. Participants self‐reported increased knowledge of wetland loss after participating in the project. Crowd‐identified morphologies are consistent with expectations, although more work is needed to directly compare LLL results with previous studies. This work provides a foundation for using crowd‐based wetland loss analysis to increase public awareness of the issue, and to contribute to land surveys or train machine learning algorithms.
more »
« less
This content will become publicly available on April 25, 2026
Crowdsourced Think-Aloud Studies
The think-aloud (TA) protocol is a useful method for evaluating user interfaces, including data visualizations. However, TA studies are time-consuming to conduct and hence often have a small number of participants. Crowdsourcing TA studies would help alleviate these problems, but the technical overhead and the unknown quality of results have restricted TA to synchronous studies. To address this gap we introduce CrowdAloud, a system for creating and analyzing asynchronous, crowdsourced TA studies. CrowdAloud captures audio and provenance (log) data as participants interact with a stimulus. Participant audio is automatically transcribed and visualized together with events data and a full recreation of the state of the stimulus as seen by participants. To gauge the value of crowdsourced TA studies, we conducted two experiments: one to compare lab-based and crowdsourced TA studies, and one to compare crowdsourced TA studies with crowdsourced text prompts. Our results suggest that crowdsourcing is a viable approach for conducting TA studies at scale.
more »
« less
- Award ID(s):
- 2213756
- PAR ID:
- 10617848
- Publisher / Repository:
- ACM
- Date Published:
- ISBN:
- 9798400713941
- Page Range / eLocation ID:
- 1 to 23
- Format(s):
- Medium: X
- Location:
- Yokohama Japan
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
null (Ed.)Crowdsourcing is widely used to create data for common natural language understanding tasks. Despite the importance of these datasets for measuring and refining model understanding of language, there has been little focus on the crowdsourcing methods used for collecting the datasets. In this paper, we compare the efficacy of interventions that have been proposed in prior work as ways of improving data quality. We use multiple-choice question answering as a testbed and run a randomized trial by assigning crowdworkers to write questions under one of four different data collection protocols. We find that asking workers to write explanations for their examples is an ineffective stand-alone strategy for boosting NLU example difficulty. However, we find that training crowdworkers, and then using an iterative process of collecting data, sending feedback, and qualifying workers based on expert judgments is an effective means of collecting challenging data. But using crowdsourced, instead of expert judgments, to qualify workers and send feedback does not prove to be effective. We observe that the data from the iterative protocol with expert assessments is more challenging by several measures. Notably, the human--model gap on the unanimous agreement portion of this data is, on average, twice as large as the gap for the baseline protocol data.more » « less
-
This work investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object. Five types of input elicitation methods are tested: binary classification (positive or negative); the ( x, y )-coordinate of the position participants believe a target object is located; level of confidence in binary response (on a scale from 0 to 100%); what participants believe the majority of the other participants' binary classification is; and participant's perceived difficulty level of the task (on a discrete scale). We design two crowdsourcing studies to test the performance of a variety of input elicitation methods and utilize data from over 300 participants. Various existing voting and machine learning (ML) methods are applied to make the best use of these inputs. In an effort to assess their performance on classification tasks of varying difficulty, a systematic synthetic image generation process is developed. Each generated image combines items from the MPEG-7 Core Experiment CE-Shape-1 Test Set into a single image using multiple parameters (e.g., density, transparency, etc.) and may or may not contain a target object. The difficulty of these images is validated by the performance of an automated image classification method. Experiment results suggest that more accurate results can be achieved with smaller training datasets when both the crowdsourced binary classification labels and the average of the self-reported confidence values in these labels are used as features for the ML classifiers. Moreover, when a relatively larger properly annotated dataset is available, in some cases augmenting these ML algorithms with the results (i.e., probability of outcome) from an automated classifier can achieve even higher performance than what can be obtained by using any one of the individual classifiers. Lastly, supplementary analysis of the collected data demonstrates that other performance metrics of interest, namely reduced false-negative rates, can be prioritized through special modifications of the proposed aggregation methods.more » « less
-
There are many factors that affect the quality of data received from crowdsourcing, including cognitive biases, varying levels of expertise, and varying subjective scales. This work investigates how the elicitation and integration of multiple modalities of input can enhance the quality of collective estimations. We create a crowdsourced experiment where participants are asked to estimate the number of dots within images in two ways: ordinal (ranking) and cardinal (numerical) estimates. We run our study with 300 participants and test how the efficiency of crowdsourced computation is affected when asking participants to provide ordinal and/or cardinal inputs and how the accuracy of the aggregated outcome is affected when using a variety of aggregation methods. First, we find that more accurate ordinal and cardinal estimations can be achieved by prompting participants to provide both cardinal and ordinal information. Second, we present how accurate collective numerical estimates can be achieved with significantly fewer people when aggregating individual preferences using optimization-based consensus aggregation models. Interestingly, we also find that aggregating cardinal information may yield more accurate ordinal estimates.more » « less
-
null (Ed.)There are many factors that affect the quality of data received from crowdsourcing, including cognitive biases, varying levels of expertise, and varying subjective scales. This work investigates how the elicitation and integration of multiple modalities of input can enhance the quality of collective estimations. We create a crowdsourced experiment where participants are asked to estimate the number of dots within images in two ways: ordinal (ranking) and cardinal (numerical) estimates. We run our study with 300 participants and test how the efficiency of crowdsourced computation is affected when asking participants to provide ordinal and/or cardinal inputs and how the accuracy of the aggregated outcome is affected when using a variety of aggregation methods. First, we find that more accurate ordinal and cardinal estimations can be achieved by prompting participants to provide both cardinal and ordinal information. Second, we present how accurate collective numerical estimates can be achieved with significantly fewer people when aggregating individual preferences using optimization-based consensus aggregation models. Interestingly, we also find that aggregating cardinal information may yield more accurate ordinal estimates.more » « less
An official website of the United States government
