As machine learning methods become more powerful and capture more nuances of human behavior, biases in the dataset can shape what the model learns and what it is evaluated on. This paper explores and attempts to quantify the uncertainties and biases due to annotator demographics when creating sentiment analysis datasets. We ask more than 1,000 crowdworkers to provide their demographic information and annotations for multimodal sentiment data and its component modalities. We show that demographic differences among annotators have a significant effect on their ratings, and that these effects also occur in each component modality. We compare predictions of different state-of-the-art multimodal machine learning algorithms against annotations provided by different demographic groups, and find that changing annotator demographics can cause an accuracy difference of more than 4.5 when determining positive versus negative sentiment. Our findings underscore the importance of accounting for crowdworker attributes, such as demographics, when building datasets, evaluating algorithms, and interpreting results for sentiment analysis.
When Do Annotator Demographics Matter? Measuring the Influence of Annotator Demographics with the POPQUORN Dataset
- Award ID(s): 2143529
- PAR ID: 10447129
- Date Published:
- Journal Name: Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII)
- Page Range / eLocation ID: 252 to 265
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Krueger, Dirk (Ed.) Abstract: We argue theoretically and document empirically that aging leads to greater (industrial) automation, because it creates a shortage of middle-aged workers specializing in manual production tasks. We show that demographic change is associated with greater adoption of robots and other automation technologies across countries and with more robotics-related activities across U.S. commuting zones. We also document more automation innovation in countries undergoing faster aging. Our directed technological change model predicts that the response of automation technologies to aging should be more pronounced in industries that rely more on middle-aged workers and in those that present greater opportunities for automation, and that productivity should improve and the labor share should decline relatively in industries that are more amenable to automation. The evidence supports all four of these predictions.
- Precise and eloquent label information is fundamental to interpreting the underlying data distributions and to adequately training supervised and semi-supervised learning models. But obtaining a large amount of labeled data demands substantial manual effort. This burden can be mitigated by acquiring labels for the most informative data instances using Active Learning. However, labels received from humans are not always reliable and risk introducing noisy class labels, which degrade a model's efficacy instead of improving it. In this paper, we address the problem of annotating sensor data instances of various Activities of Daily Living (ADLs) in a smart home context. We exploit the interactions between the users and annotators in terms of relationships that span space and time and that also account for an activity. We propose a novel annotator selection model, SocialAnnotator, which exploits the interactions between the users and annotators and ranks the annotators based on their level of correspondence. We also introduce a novel approach to measure this correspondence distance using the spatial and temporal information of interactions and the types of relationships and activities. We validate our proposed SocialAnnotator framework in smart environments, achieving ≈ 84% statistical confidence in data annotation.
- Machine learning models are bounded by the credibility of the ground truth data used for both training and testing. Regardless of the problem domain, ground truth annotation is manual and tedious, as it requires a considerable amount of human intervention. With the advent of Active Learning with multiple annotators, the burden can be somewhat mitigated by actively acquiring labels for the most informative data instances. However, multiple annotators with varying degrees of expertise pose a new set of challenges in terms of the quality of the labels received and the availability of the annotators. Due to the limited amount of ground truth information addressing the variability of Activities of Daily Living (ADLs), activity recognition models using wearable and mobile devices are still not robust enough for real-world deployment. In this paper, we propose a deep model combined with active learning, which updates its network parameters based on the optimization of a joint loss function. We then propose a novel annotator selection model that exploits the relationships among the users while considering their heterogeneity with respect to their expertise and their physical and spatial context. Our proposed model leverages model-free deep reinforcement learning in a partially observable environment setting to capture the action-reward interaction among multiple annotators. Our experiments in real-world settings show that our active deep model converges to optimal accuracy with fewer labeled instances and achieves an 8% improvement in accuracy in fewer iterations.

