

Search for: All records

Award ID contains: 1850355


  1. Abstract

    We present a wisdom of crowds study where participants are asked to order a small set of images based on the number of dots they contain and then to guess the respective number of dots in each image. We test two input elicitation interfaces: one elicits the two modalities of estimates jointly and the other independently. We show that the latter interface yields higher-quality estimates, even though the multimodal estimates tend to be more self-contradictory. The inputs are aggregated via optimization and voting-rule based methods to estimate the true ordering of a larger universal set of images. We demonstrate that the quality of collective estimates from the simpler and more computationally efficient voting methods is comparable to that achieved by the more complex optimization model. Lastly, we find that using multiple modalities of estimates from one group yields better collective estimates compared to mixing numerical estimates from one group with the ordinal estimates from a different group.

     
    Free, publicly-accessible full text available February 1, 2025
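    The abstract above does not name the voting rules it uses. As a minimal sketch only, the following assumes a Borda-style rule (our choice, not necessarily the authors') that merges each participant's ordering of a small subset into one ordering of the larger set. The function name `borda_aggregate` and the sample data are hypothetical.

```python
from collections import defaultdict


def borda_aggregate(partial_rankings):
    """Merge partial rankings (lists of item ids, fewest dots first) into one
    ordering of every item seen, using the average normalized position."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ranking in partial_rankings:
        m = len(ranking)
        for pos, item in enumerate(ranking):
            # Normalized position in [0, 1]; 0 means first within this subset.
            totals[item] += pos / (m - 1) if m > 1 else 0.5
            counts[item] += 1
    avg = {item: totals[item] / counts[item] for item in totals}
    # Sort by average score; break ties by item id so the result is deterministic.
    return sorted(avg, key=lambda item: (avg[item], item))


# Hypothetical inputs: three participants, each ordering three of four images.
rankings = [["a", "b", "c"], ["b", "c", "d"], ["a", "c", "d"]]
consensus = borda_aggregate(rankings)
print(consensus)
```

    Normalizing positions by subset size keeps rankings of different lengths comparable; an optimization model would instead search for the ordering that minimizes total disagreement with the inputs.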
  2. Ordering polytopes have been instrumental to the study of combinatorial optimization problems arising in a variety of fields including comparative probability, computational social choice, and group decision-making. The weak order polytope is defined as the convex hull of the characteristic vectors of all binary orders on n alternatives that are reflexive, transitive, and total. By and large, facet defining inequalities (FDIs) of this polytope have been obtained through simple enumeration and through connections with other combinatorial polytopes. This paper derives five new large classes of FDIs by utilizing the equivalent representations of a weak order as a ranking of n alternatives that allows ties; this connection simplifies the construction of valid inequalities, and it enables groupings of characteristic vectors into useful structures. We demonstrate that a number of FDIs previously obtained through enumeration are actually special cases of the large classes. This work also introduces novel construction procedures for generating affinely independent members of the identified ranking structures. Additionally, it states two conjectures on how to derive many more large classes of FDIs using the featured techniques. 
    Free, publicly-accessible full text available September 27, 2024
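    To make the link between rankings with ties and weak orders concrete, here is a small sketch that builds a characteristic vector from a ranking and checks the three defining properties. The indexing convention (one entry per ordered pair, diagonal included) and the function names are assumptions for illustration; the paper may index its vectors differently.

```python
from itertools import product


def weak_order_vector(ranks):
    """ranks[i] is the rank position of alternative i (tied items share a value).
    Returns x with x[(i, j)] = 1 when i is ranked at least as well as j."""
    n = len(ranks)
    return {(i, j): int(ranks[i] <= ranks[j]) for i, j in product(range(n), repeat=2)}


def is_weak_order(x, n):
    """Check that the binary relation encoded by x is reflexive, total, and transitive."""
    idx = range(n)
    reflexive = all(x[(i, i)] == 1 for i in idx)
    total = all(x[(i, j)] + x[(j, i)] >= 1 for i in idx for j in idx)
    transitive = all(
        x[(i, k)] >= x[(i, j)] + x[(j, k)] - 1
        for i in idx for j in idx for k in idx
    )
    return reflexive and total and transitive


# Alternatives 0 and 1 are tied for first place, ahead of alternative 2.
x = weak_order_vector([1, 1, 2])
print(is_weak_order(x, 3))
```

    The transitivity check is written as a linear inequality, which is the form such a constraint typically takes when the relation is modeled with binary variables.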
  3. Nguyen, Ngoc T ; Botzheim, János ; Gulyás, László ; Nunez, Manuel ; Treur, Jan ; Vossen, Gottfried ; Kozierkiewicz, Adrianna (Ed.)
    The stock market is affected by a seemingly infinite number of factors, making it highly uncertain yet impactful. A large determinant of stock performance is public sentiment, which can often be volatile. To integrate human inputs in a more structured and effective manner, this study explores a combination of the wisdom of crowds concept and machine learning (ML) for stock price prediction. A crowdsourcing study is developed to test three ways to elicit stock predictions from the crowd. The study also assesses the impact of priming participants with estimates provided by a long short-term memory (LSTM) model developed herein for this context.
    Free, publicly-accessible full text available September 22, 2024
  4. Free, publicly-accessible full text available September 1, 2024
  5. Free, publicly-accessible full text available May 1, 2024
  6. This work investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object. Five types of input elicitation methods are tested: binary classification (positive or negative); the (x, y)-coordinates of the position where participants believe a target object is located; level of confidence in binary response (on a scale from 0 to 100%); what participants believe the majority of the other participants' binary classification is; and participants' perceived difficulty level of the task (on a discrete scale). We design two crowdsourcing studies to test the performance of a variety of input elicitation methods and utilize data from over 300 participants. Various existing voting and machine learning (ML) methods are applied to make the best use of these inputs. In an effort to assess their performance on classification tasks of varying difficulty, a systematic synthetic image generation process is developed. Each generated image combines items from the MPEG-7 Core Experiment CE-Shape-1 Test Set into a single image using multiple parameters (e.g., density, transparency, etc.) and may or may not contain a target object. The difficulty of these images is validated by the performance of an automated image classification method. Experimental results suggest that higher accuracy can be achieved with smaller training datasets when both the crowdsourced binary classification labels and the average of the self-reported confidence values in these labels are used as features for the ML classifiers. Moreover, when a relatively larger properly annotated dataset is available, in some cases augmenting these ML algorithms with the results (i.e., probability of outcome) from an automated classifier can achieve even higher performance than what can be obtained by using any one of the individual classifiers. Lastly, supplementary analysis of the collected data demonstrates that other performance metrics of interest, namely reduced false-negative rates, can be prioritized through special modifications of the proposed aggregation methods.
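    As a rough illustration of how binary labels and self-reported confidence might be combined per image, the sketch below computes two candidate features and a simple confidence-weighted vote. The feature names, the weighting rule, and the sample responses are all assumptions made here for illustration; they are not taken from the study.

```python
from statistics import mean


def image_features(responses):
    """responses: list of (label, confidence) pairs for one image,
    with label in {0, 1} and confidence rescaled to [0, 1]."""
    labels = [label for label, _ in responses]
    confidences = [conf for _, conf in responses]
    return {
        "positive_share": mean(labels),        # fraction of positive votes
        "mean_confidence": mean(confidences),  # average self-reported confidence
    }


def majority_vote(responses):
    """Plain majority rule over the binary labels (ties go to negative)."""
    positives = sum(label for label, _ in responses)
    return 1 if positives * 2 > len(responses) else 0


def confidence_weighted_vote(responses):
    """Each vote counts in proportion to the voter's stated confidence."""
    score = sum(conf if label == 1 else -conf for label, conf in responses)
    return 1 if score > 0 else 0


# Hypothetical responses: one confident positive vote, two hesitant negative votes.
responses = [(1, 0.95), (0, 0.4), (0, 0.5)]
features = image_features(responses)
print(features, majority_vote(responses), confidence_weighted_vote(responses))
```

    In this example the two rules disagree, which is the kind of case where confidence carries information that the labels alone do not; the feature dictionary could be passed to any standard classifier alongside other attributes.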
  7. Top-k lists are being increasingly utilized in various fields and applications including information retrieval, machine learning, and recommendation systems. Since multiple top-k lists may be generated by different algorithms to evaluate the same set of entities or system of interest, there is often a need to consolidate this collection of heterogeneous top-k lists to obtain a more robust and coherent list. This work introduces various exact mathematical formulations of the top-k list aggregation problem under the generalized Kendall tau distance. Furthermore, the strength of the proposed formulations is analyzed from a polyhedral point of view. 
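    For readers unfamiliar with the distance, the sketch below computes a generalized Kendall tau distance between two top-k lists, assuming the commonly used definition with a penalty parameter p for pairs whose relative order cannot be inferred from either list. Whether the paper uses exactly this variant is an assumption; the function name is hypothetical.

```python
from itertools import combinations


def kendall_tau_topk(tau1, tau2, p=0.5):
    """Generalized Kendall tau distance between two top-k lists (best item first)."""
    pos1 = {item: r for r, item in enumerate(tau1)}
    pos2 = {item: r for r, item in enumerate(tau2)}
    dist = 0.0
    for i, j in combinations(sorted(set(tau1) | set(tau2)), 2):
        in1 = (i in pos1, j in pos1)
        in2 = (i in pos2, j in pos2)
        if all(in1) and all(in2):
            # Both items in both lists: penalize opposite relative order.
            if (pos1[i] - pos1[j]) * (pos2[i] - pos2[j]) < 0:
                dist += 1
        elif all(in1) and any(in2):
            # Both in tau1, one in tau2: the item missing from tau2 is implicitly
            # behind the present one there, so penalize if tau1 puts it ahead.
            present, absent = (i, j) if in2[0] else (j, i)
            if pos1[absent] < pos1[present]:
                dist += 1
        elif all(in2) and any(in1):
            # Symmetric case: both in tau2, exactly one in tau1.
            present, absent = (i, j) if in1[0] else (j, i)
            if pos2[absent] < pos2[present]:
                dist += 1
        elif all(in1) or all(in2):
            # Both in one list and neither in the other: order unknown, penalty p.
            dist += p
        else:
            # Each item appears in a different list only: they must disagree.
            dist += 1
    return dist


print(kendall_tau_topk(["a", "b", "c"], ["b", "a", "d"]))
```

    An aggregation model would then look for the top-k list that minimizes the sum of these distances to all input lists.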
  8. Rank aggregation has many applications in computer science, operations research, and group decision-making. This paper introduces lower bounds on the Kemeny aggregation problem when the input rankings are non-strict (with and without ties). It generalizes some of the existing lower bounds for strict rankings to the case of non-strict rankings, and it proposes shortcuts for reducing the run time of these techniques. More specifically, we use Condorcet criterion variations and the Branch & Cut method to accelerate the lower bounding process. 
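    To show what a lower bound on the Kemeny objective looks like, the sketch below implements the standard pairwise-majority bound for strict rankings and compares it with the brute-force optimum on a small instance. This is a textbook baseline under the assumption of strict inputs, not the bounds for non-strict rankings that the paper proposes; all function names are hypothetical.

```python
from itertools import combinations, permutations


def kendall_tau(r1, r2):
    """Number of item pairs that two strict rankings order differently."""
    pos1 = {x: k for k, x in enumerate(r1)}
    pos2 = {x: k for k, x in enumerate(r2)}
    return sum(
        1 for a, b in combinations(r1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )


def pairwise_lower_bound(rankings):
    """Any consensus ranking must disagree, on each pair, with at least the
    minority side, so summing the minority counts bounds the objective below."""
    bound = 0
    for a, b in combinations(rankings[0], 2):
        a_first = sum(1 for r in rankings if r.index(a) < r.index(b))
        bound += min(a_first, len(rankings) - a_first)
    return bound


def kemeny_brute_force(rankings):
    """Optimal Kemeny objective by enumeration (only viable for a few items)."""
    return min(
        sum(kendall_tau(list(candidate), r) for r in rankings)
        for candidate in permutations(rankings[0])
    )


# A Condorcet cycle: the bound is not tight because the majorities are inconsistent.
rankings = [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]]
print(pairwise_lower_bound(rankings), kemeny_brute_force(rankings))
```

    The gap on cyclic instances is what motivates stronger bounds; when the pairwise majorities are consistent with a single ranking, the bound equals the optimum.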
  9. Kamar, Ece ; Luther, Kurt (Ed.)
    This study investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object. Three types of input elicitation methods are tested: binary classification (positive or negative); level of confidence in binary response (on a scale from 0-100%); and what participants believe the majority of the other participants’ binary classification is. We design a crowdsourcing experiment to test the performance of the proposed input elicitation methods and use data from over 200 participants. Various existing voting and machine learning (ML) methods are applied and others developed to make the best use of these inputs. In an effort to assess their performance on classification tasks of varying difficulty, a systematic synthetic image generation process is developed. Each generated image combines items from the MPEG-7 Core Experiment CE-Shape-1 Test Set into a single image using multiple parameters (e.g., density, transparency, etc.) and may or may not contain a target object. The difficulty of these images is validated by the performance of an automated image classification method. Experimental results suggest that more accurate classifications can be achieved when using the average of the self-reported confidence values as an additional attribute for ML algorithms relative to what is achieved with more traditional approaches. Additionally, they demonstrate that other performance metrics of interest, namely reduced false-negative rates, can be prioritized through special modifications of the proposed aggregation methods that leverage the variety of elicited inputs. 