skip to main content


Title: Bayesian Decision Process for Budget-efficient Crowdsourced Clustering

The performance of clustering depends on an appropriately defined similarity between two items. When the similarity is measured based on human perception, human workers are often employed to estimate a similarity score between items in order to support clustering, leading to a procedure called crowdsourced clustering. Assuming a monetary reward is paid to a worker for each similarity score and assuming the similarities between pairs and workers' reliability have a large diversity, when the budget is limited, it is critical to wisely assign pairs of items to different workers to optimize the clustering result. We model this budget allocation problem as a Markov decision process where item pairs are dynamically assigned to workers based on the historical similarity scores they provided. We propose an optimistic knowledge gradient policy where the assignment of items in each stage is based on the minimum-weight K-cut defined on a similarity graph. We provide simulation studies and real data analysis to demonstrate the performance of the proposed method.

 
more » « less
Award ID(s):
1845444
NSF-PAR ID:
10203136
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
International Joint Conferences on Artificial Intelligence (IJCAI)
Page Range / eLocation ID:
2044 to 2050
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Crowdsourcing has been widely adopted to perform large projects suitable for human participation, in which tasks are usually distributed to workers. Many such projects involve classification/labeling certain collections of items through semisupervised clustering, in which queries on small subsets of the items are assigned to workers in the crowd. The answers are collected by a taskmaster and the goal is to fully recover the labels. This problem can be modeled as a sparsely encoded source coding problem, where each query answer, regarded as a code bit, is the XOR of a small number of labels, as source information bits. While the problem of designing compression/source coding schemes achieving Shannon’s optimal compression rate is very well-studied, a few have considered sparsely encoded schemes. In this paper we leverage the connections between this problem and well-studied codes with sparse representations for the channel coding problem to provide querying schemes with almost optimal number of queries, each of which involving only a constant number of labels. We also extend this scenario to the case where some workers can be unresponsive. For this case, we propose querying schemes where each query involves only log n items, where n is the total number of items to be labeled. Furthermore, we consider classification of two correlated labeling systems and provide two-stage querying schemes with almost optimal number of queries each involving a constant number of labels. 
    more » « less
  2. The ability to recognize and make inductive inferences based on relational similarity is fundamental to much of human higher cognition. However, relational similarity is not easily defined or measured, which makes it difficult to determine whether individual differences in cognitive capacity or semantic knowledge impact relational processing. In two experiments, we used a multi-arrangement task (previously applied to individual words or objects) to efficiently assess similarities between word pairs instantiating various abstract relations. Experiment 1 established that the method identifies word pairs expressing the same relation as more similar to each other than to those expressing different relations. Experiment 2 extended these results by showing that relational similarity measured by the multi-arrangement task is sensitive to more subtle distinctions. Word pairs instantiating the same specific subrelation were judged as more similar to each other than to those instantiating different subrelations within the same general relation type. In addition, Experiment 2 found that individual differences in both fluid intelligence and crystalized verbal intelligence correlated with differentiation of relation similarity judgments. 
    more » « less
  3. Defining the similarity between chemical entities is an essential task in polymer informatics, enabling ranking, clustering, and classification. Despite its importance, the pairwise chemical similarity of polymers remains an open problem. Here, a similarity function for polymers with well-defined backbones is designed based on polymers’ stochastic graph representations generated from canonical BigSMILES, a structurally based line notation for describing macromolecules. The stochastic graph representations are separated into three parts: repeat units, end groups, and polymer topology. The earth mover’s distance is utilized to calculate the similarity of the repeat units and end groups, while the graph edit distance is used to calculate the similarity of the topology. These three values can be linearly or nonlinearly combined to yield an overall pairwise chemical similarity score for polymers that is largely consistent with the chemical intuition of expert users and is adjustable based on the relative importance of different chemical features for a given similarity problem. This method gives a reliable solution to quantitatively calculate the pairwise chemical similarity score for polymers and represents a vital step toward building search engines and quantitative design tools for polymer data. 
    more » « less
  4. Clustering algorithms are often evaluated using metrics which compare with ground-truth cluster assignments, such as Rand index and NMI. Algorithm performance may vary widely for different hyperparameters, however, and thus model selection based on optimal performance for these metrics is discordant with how these algorithms are applied in practice, where labels are unavailable and tuning is often more art than science. It is therefore desirable to compare clustering algorithms not only on their optimally tuned performance, but also some notion of how realistic it would be to obtain this performance in practice. We propose an evaluation of clustering methods capturing this ease-of-tuning by modeling the expected best clustering score under a given computation budget. To encourage the adoption of the proposed metric alongside classic clustering evaluations, we provide an extensible benchmarking framework. We perform an extensive empirical evaluation of our proposed metric on popular clustering algorithms over a large collection of datasets from different domains, and observe that our new metric leads to several noteworthy observations. 
    more » « less
  5. Abstract

    As inspirational stimuli can assist designers with achieving enhanced design outcomes, supporting the retrieval of impactful sources of inspiration is important. Existing methods facilitating this retrieval have relied mostly on semantic relationships, e.g., analogical distances. Increasingly, data-driven methods can be leveraged to represent diverse stimuli in terms of multi-modal information, enabling designers to access stimuli in terms of less explored, non-text-based relationships. Toward improved retrieval of multi-modal representations of inspirational stimuli, this work compares human-evaluated and computationally derived similarities between stimuli in terms of non-text-based visual and functional features. A human subjects study (n = 36) was conducted where similarity assessments between triplets of 3D-model parts were collected and used to construct psychological embedding spaces. Distances between unique part embeddings were used to represent similarities in terms of visual and functional features. Obtained distances were compared with computed distances between embeddings of the same stimuli generated using artificial intelligence (AI)-based deep-learning approaches. When used to assess similarity in appearance and function, these representations were found to be largely consistent, with highest agreement found when assessing pairs of stimuli with low similarity. Alignment between models was otherwise lower when identifying the same pairs of stimuli with higher levels of similarity. Importantly, qualitative data also revealed insights regarding how humans made similarity assessments, including more abstract information not captured using AI-based approaches. Toward providing inspiration to designers that considers design problems, ideas, and solutions in terms of non-text-based relationships, further exploration of how these relationships are represented and evaluated is encouraged.

     
    more » « less