Mobile and web apps increasingly rely on data generated or provided by users, such as uploaded documents and images. Unfortunately, these apps may raise significant user privacy concerns. Specifically, to train or adapt models that accurately process the huge volumes of data continuously collected from millions of app users, app and service providers have widely adopted crowdsourcing, recruiting crowd workers to manually annotate or transcribe samples of this ever-changing user data. However, when users' data are uploaded through apps and become widely accessible to hundreds of thousands of anonymous crowd workers, many human-in-the-loop privacy questions arise for both the app user community and the crowd worker community. In this paper, we investigate the privacy risks brought by this significant trend of large-scale, crowd-powered processing of app users' data generated in their daily activities. We consider the representative case of receipt scanning apps, which have millions of users, and focus on the corresponding receipt transcription tasks that frequently appear on crowdsourcing platforms. We design and conduct an app user survey study (n=108) to explore how app users perceive privacy in the context of using receipt scanning apps. We also design and conduct a crowd worker survey study (n=102) to explore crowd workers' experiences with receipt and other transcription tasks, as well as their attitudes towards such tasks. Overall, we found that most app users and crowd workers expressed strong concerns about the potential privacy risks to receipt owners, and they also strongly agreed on the need to protect receipt owners' privacy. Our work provides insights into app users' potential privacy risks in crowdsourcing, and highlights the need for, and challenges of, protecting third-party users' privacy on crowdsourcing platforms. We have responsibly disclosed our findings to the relevant crowdsourcing platform and app providers.
                            Crowdsourcing as a Tool for Research: Methodological, Fair, and Political Considerations
                        
                    
    
            Crowdsourcing platforms are powerful tools for academic researchers. Proponents claim that crowdsourcing helps researchers quickly and affordably recruit enough human subjects with diverse backgrounds to generate significant statistical power, while critics raise concerns about unreliable data quality, labor exploitation, and unequal power dynamics between researchers and workers. We examine these concerns along three dimensions: methods, fairness, and politics. We find that researchers offer vastly different compensation rates for crowdsourced tasks, and address potential concerns about data validity by using platform-specific tools and user verification methods. Additionally, workers depend upon crowdsourcing platforms for a significant portion of their income, are motivated more by fear of losing access to work than by specific compensation rates, and are frustrated by a lack of transparency and occasional unfair treatment from job requesters. Finally, we discuss critical computing scholars’ proposals to address crowdsourcing’s problems, challenges with implementing these resolutions, and potential avenues for future research. 
- Award ID(s): 1936968
- PAR ID: 10286990
- Date Published:
- Journal Name: Bulletin of Science, Technology & Society
- Volume: 40
- Issue: 3-4
- ISSN: 0270-4676
- Page Range / eLocation ID: 40 to 53
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Crowdsourcing has become a popular means to solicit assistance for scientific research. From classifying images or texts to responding to surveys, tapping into the knowledge of crowds to complete complex tasks has become a common strategy in the social and information sciences. Although the timeliness and cost-effectiveness of crowdsourcing may provide desirable advantages to researchers, the data it generates may be of lower quality for some scientific purposes. The quality control mechanisms, if any, offered by common crowdsourcing platforms may not provide robust measures of data quality. This study explores whether research task participants engage in motivated misreporting, whereby participants cut corners to reduce their workload while performing various scientific tasks online. We conducted an experiment with three common crowdsourcing tasks: answering surveys, coding images, and classifying online social media content. The experiment recruited workers from two sources: a crowdsourcing platform (Amazon Mechanical Turk) and a commercial online survey panel. The analysis addresses two questions: (1) whether online panelists and crowd workers engage in motivated misreporting differently, and (2) whether the patterns of misreporting vary by task type. The study focuses on the analysis of the survey-answering experiment and offers quality assurance guidelines for using crowdsourcing in social science research.
- As independently contracted employees, gig workers disproportionately suffer the consequences of workplace surveillance, which include increased pressures to work, breaches of privacy, and decreased digital autonomy. Despite the negative impacts of workplace surveillance, gig workers lack the tools, strategies, and workplace social support to protect themselves against these harms. Meanwhile, some critical theorists have proposed sousveillance as a potential means of countering such abuses of power, whereby those under surveillance monitor those in positions of authority (e.g., gig workers collect data about requesters or platforms). To understand the benefits of sousveillance systems in the gig economy, we conducted semi-structured interviews and led co-design activities with gig workers. We use care ethics as a guiding concept to understand our interview and co-design data, while also focusing on empathic sousveillance technology design recommendations. Through our study, we identify gig workers' attitudes towards and past experiences with sousveillance. We also uncover the types of sousveillance technologies imagined by workers, provide design recommendations, and finish by discussing how to create empowering, empathic spaces on gig platforms.
- In recent years, gig work platforms have gained popularity as a way for individuals to earn money; as of 2021, 16% of Americans have at some point earned money from such platforms. Despite their popularity, and despite a history of unfair data collection practices and worker safety issues, little is known about the data collected from workers (and users) by gig platforms or about the privacy dark pattern designs present in their apps. This paper presents an empirical measurement of 16 gig work platforms' data practices in the U.S. We analyze what data is collected by these platforms, and how it is shared and used. Finally, we consider how these practices constitute privacy dark patterns. To that end, we develop a novel combination of methods to address gig-worker-specific challenges in experimentation and data collection, enabling the largest in-depth study of such platforms to date. We find extensive data collection and sharing with 60 third parties, including the sharing of reversible hashes of worker Social Security Numbers (SSNs), along with dark patterns that subject workers to greater privacy risk and opportunistically use collected data to nag workers in off-platform messages. (A short sketch after this list illustrates why such hashes are effectively reversible.) We conclude this paper with proposed interdisciplinary mitigations for improving gig worker privacy protections. After we disclosed our SSN-related findings to affected platforms, the platforms confirmed that the issue had been mitigated, which is consistent with our independent audit of the affected platforms. Analysis code and redacted datasets will be made available to those who wish to reproduce our findings.
- Flexibility is essential for optimizing crowdworker performance in the digital labor market, and prior research shows that integrating diverse devices can enhance this flexibility. While studies on Amazon Mechanical Turk show the need for tailored workflows and varied device usage and preferences, it remains unclear if these insights apply to other platforms. To explore this, we conducted a survey on another major crowdsourcing platform, Prolific, involving 1,000 workers. Our findings reveal that desktops are still the primary devices for crowdwork, but Prolific workers display more diverse usage patterns and a greater interest in adopting smartwatches, smart speakers, and tablets compared to MTurk workers. While current use of these newer devices is limited, there is growing interest in employing them for future tasks. These results underscore the importance for crowdsourcing platforms to develop platform-specific strategies that promote more flexible and engaging workflows, better aligning with the diverse needs of their crowdworkers.
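The reversible-hash finding in the gig-platform study above turns on a simple point: the SSN space contains only about 10^9 values, so any unsalted, unkeyed digest of a raw SSN can be inverted by exhaustive search. The Python sketch below is a minimal illustration of that reasoning; it assumes, purely for illustration, a plain SHA-256 over the 9-digit string (the platforms' actual hashing scheme is not specified here), and the example value is fabricated.

```python
import hashlib
from typing import Optional

def hash_ssn(ssn: str) -> str:
    """Unsalted SHA-256 of a raw 9-digit SSN string (assumed scheme, for illustration only)."""
    return hashlib.sha256(ssn.encode("utf-8")).hexdigest()

def brute_force_ssn(target_digest: str) -> Optional[str]:
    """Recover an SSN from its digest by enumerating all ~10^9 nine-digit candidates.

    A full sweep is on the order of a billion hash operations, well within
    reach of commodity hardware, which is why an unsalted, unkeyed hash of
    an SSN offers little real protection.
    """
    for n in range(10**9):
        candidate = f"{n:09d}"
        if hash_ssn(candidate) == target_digest:
            return candidate
    return None

if __name__ == "__main__":
    leaked = hash_ssn("000004242")   # fabricated example value, not a real SSN
    print(brute_force_ssn(leaked))   # prints "000004242" after ~4,242 attempts
```

A per-record salt does not change this picture if the salt is stored or shared alongside the digest; the usual mitigations are keyed constructions (e.g., HMAC with a key the third party never sees) or tokenization, where the mapping is kept only by the data holder.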