There are many factors that affect the quality of data received from crowdsourcing, including cognitive biases, varying levels of expertise, and varying subjective scales. This work investigates how the elicitation and integration of multiple modalities of input can enhance the quality of collective estimations. We create a crowdsourced experiment where participants are asked to estimate the number of dots within images in two ways: ordinal (ranking) and cardinal (numerical) estimates. We run our study with 300 participants and test how the efficiency of crowdsourced computation is affected when asking participants to provide ordinal and/or cardinal inputs and how the accuracy of the aggregated outcome is affected when using a variety of aggregation methods. First, we find that more accurate ordinal and cardinal estimations can be achieved by prompting participants to provide both cardinal and ordinal information. Second, we present how accurate collective numerical estimates can be achieved with significantly fewer people when aggregating individual preferences using optimization-based consensus aggregation models. Interestingly, we also find that aggregating cardinal information may yield more accurate ordinal estimates.
more »
« less
Elicitation and aggregation of multimodal estimates improve wisdom of crowd effects on ordering tasks
Abstract We present a wisdom of crowds study where participants are asked to order a small set of images based on the number of dots they contain and then to guess the respective number of dots in each image. We test two input elicitation interfaces—one elicits the two modalities of estimates jointly and the other independently. We show that the latter interface yields higher quality estimates, even though the multimodal estimates tend to be more self-contradictory. The inputs are aggregated via optimization and voting-rule based methods to estimate the true ordering of a larger universal set of images. We demonstrate that the quality of collective estimates from the simpler yet more computationally-efficient voting methods is comparable to that achieved by the more complex optimization model. Lastly, we find that using multiple modalities of estimates from one group yields better collective estimates compared to mixing numerical estimates from one group with the ordinal estimates from a different group.
more »
« less
- Award ID(s):
- 1850355
- PAR ID:
- 10491068
- Publisher / Repository:
- Springer Nature
- Date Published:
- Journal Name:
- Scientific Reports
- Volume:
- 14
- Issue:
- 1
- ISSN:
- 2045-2322
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
null (Ed.)There are many factors that affect the quality of data received from crowdsourcing, including cognitive biases, varying levels of expertise, and varying subjective scales. This work investigates how the elicitation and integration of multiple modalities of input can enhance the quality of collective estimations. We create a crowdsourced experiment where participants are asked to estimate the number of dots within images in two ways: ordinal (ranking) and cardinal (numerical) estimates. We run our study with 300 participants and test how the efficiency of crowdsourced computation is affected when asking participants to provide ordinal and/or cardinal inputs and how the accuracy of the aggregated outcome is affected when using a variety of aggregation methods. First, we find that more accurate ordinal and cardinal estimations can be achieved by prompting participants to provide both cardinal and ordinal information. Second, we present how accurate collective numerical estimates can be achieved with significantly fewer people when aggregating individual preferences using optimization-based consensus aggregation models. Interestingly, we also find that aggregating cardinal information may yield more accurate ordinal estimates.more » « less
-
null (Ed.)Scientists use a mathematical subject called topology to study the shapes of objects. An important part of topology is counting the number of pieces and the number of holes in an object, and researchers use this information to group objects into different types. For example, a doughnut has the same number of holes and the same number of pieces as a teacup with one handle, but it is different from a ball. In studies that resemble activities like “connect-the-dots,” scientists use ideas from topology to study the “shape” of data. Ideas and methods from topology have been used to study the branching structures of veins in leaves, voting in elections, flight patterns in models of bird flocking, and more.more » « less
-
Voting is used widely to identify a collective decision for a group of agents, based on their preferences. In this paper, we focus on evaluating and designing voting rules that support both the privacy of the voting agents and a notion of fairness over such agents. To do this, we introduce a novel notion of group fairness and adopt the existing notion of local differential privacy. We then evaluate the level of group fairness in several existing voting rules, as well as the trade-offs between fairness and privacy, showing that it is not possible to always obtain maximal economic efficiency with high fairness or high privacy levels. Then, we present both a machine learning and a constrained optimization approach to design new voting rules that are fair while maintaining a high level of economic efficiency. Finally, we empirically examine the effect of adding noise to create local differentially private voting rules and discuss the three-way trade-off between economic efficiency, fairness, and privacy.This paper appears in the special track on AI & Society.more » « less
-
Label noise in real-world datasets encodes wrong correlation patterns and impairs the generalization of deep neural networks (DNNs). It is critical to find efficient ways to detect corrupted patterns. Current methods primarily focus on designing robust training techniques to prevent DNNs from memorizing corrupted patterns. These approaches often require customized training processes and may overfit corrupted patterns, leading to a performance drop in detection. In this paper, from a more data-centric perspective, we propose a training-free solution to detect corrupted labels. Intuitively, ``closer'' instances are more likely to share the same clean label. Based on the neighborhood information, we propose two methods: the first one uses ``local voting" via checking the noisy label consensuses of nearby features. The second one is a ranking-based approach that scores each instance and filters out a guaranteed number of instances that are likely to be corrupted. We theoretically analyze how the quality of features affects the local voting and provide guidelines for tuning neighborhood size. We also prove the worst-case error bound for the ranking-based method. Experiments with both synthetic and real-world label noise demonstrate our training-free solutions consistently and significantly improve most of the training-based baselines.more » « less
An official website of the United States government

