Different techniques have been recommended to detect fraudulent responses in online surveys, but little research has been taken to systematically test the extent to which they actually work in practice. In this paper, we conduct an empirical evaluation of 22 antifraud tests in two complementary online surveys. The first survey recruits Rust programmers on public online forums and social media networks. We find that fraudulent respondents involve both bot and human characteristics. Among different anti-fraud tests, those designed based on domain knowledge are the most effective. By combining individual tests, we can achieve a detection performance as good as commercial techniques while making the results more explainable. To explore these tests under a broader context, we ran a different survey on Amazon Mechanical Turk (MTurk). The results show that for a generic survey without requiring users to have any domain knowledge, it is more difficult to distinguish fraudulent responses. However, a subset of tests still remain effective.
more »
« less
When Research Becomes All About the Bots: A Case Study on Fraud Prevention and Participant Validation in the Context of Abortion Storytelling
Effective fraud prevention and participant validation are essential for ensuring data quality in today's highly-digitized research landscape. Increasingly sophisticated bots and high levels of fraudulent participants have generated a need for more complex and nuanced methods to combat fraudulent activity. In this paper, we share our experiences with fraudulent survey responses, which we encountered in our work around abortion storytelling, and the multi-stage protocol that we developed to validate participants. We found that effective fraud prevention should start early and include a variety of flagging methods to encourage holistic pattern-searching in data. Researchers should overestimate the amount of time they will need to validate participants and consider asking participants to assist in the validation process. We encourage researchers to be transparent about the interpretive nature of this work. To this end, we contribute a Participant Validation Guide in supplemental materials for community members to adapt in their own practices.
more »
« less
- Award ID(s):
- 1852294
- PAR ID:
- 10521629
- Publisher / Repository:
- ACM
- Date Published:
- ISBN:
- 9798400703317
- Page Range / eLocation ID:
- 1 to 8
- Subject(s) / Keyword(s):
- fraud fraud prevention digital survey recruitment asynchronous remote community
- Format(s):
- Medium: X
- Location:
- Honolulu HI USA
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Online merchants face difficulties in using existing card fraud detection algorithms, so in this paper we propose a novel proactive fraud detection model using what we call invariant diversity to reveal patterns among attributes of the devices (computers or smartphones) that are used in conducting the transactions. The model generates a regression function from a diversity index of various attribute combinations, and use it to detect anomalies inherent in certain fraudulent transactions. This approach allows for proactive fraud detection using a relatively small number of unsupervised transactions and is resistant to fraudsters' device obfuscation attempt. We tested our system successfully on real online merchant transactions and it managed to find several instances of previously undetected fraudulent transactions.more » « less
-
DOS-GNN: Dual-Feature Aggregations with Over-Sampling for Class-Imbalanced Fraud Detection On GraphsAs fraudulent activities have shot up manifolds, fraud detection has emerged as a pivotal process in different fields (e.g., e-commerce, online reviews, and social networks). Since interactions among entities provide valuable insights into fraudulent activities, such behaviors can be naturally represented as graph structures, where graph neural networks (GNNs) have been developed as prominent models to boost the efficacy of fraud detection. In graph-based fraud detection, handling imbalanced datasets poses a significant challenge, as the minority class often gets overshadowed, diminishing the performance of conventional GNNs. While oversampling has recently been adapted for imbalanced graphs, it contends with issues such as graph heterophily and noisy edge synthesis. To address these limitations, this paper introduces DOS-GNN, incorporating Dual-feature aggregation with Over-Sampling to advance GNNs for class-imbalanced fraud detection on graphs. This model exploits feature separation and dual-feature aggregation to mitigate the impact of heterophily and acquire refined node embeddings that facilitate fraud oversampling to balance class distribution without the need for edge synthesis. Extensive experiments on four large and real-world fraud datasets demonstrate that DOS-GNN can significantly improve fraud detection performance on graphs with different imbalance ratios and homophily ratios, outperforming state-of-the-art GNN models.more » « less
-
Black Hat App Search Optimization (ASO) in the form of fake reviews and sockpuppet accounts, is prevalent in peer-opinion sites, e.g., app stores, with negative implications on the digital and real lives of their users. To detect and filter fraud, a growing body of research has provided insights into various aspects of fraud posting activities, and made assumptions about the working procedures of the fraudsters from online data. However, such assumptions often lack empirical evidence from the actual fraud perpetrators. To address this problem, in this paper, we present results of both a qualitative study with 18 ASO workers we recruited from 5 freelancing sites, concerning activities they performed on Google Play, and a quantitative investigation with fraud-related data collected from other 39 ASO workers. We reveal findings concerning various aspects of ASO worker capabilities and behaviors, including novel insights into their working patterns, and supporting evidence for several existing assumptions. Further, we found and report participant-revealed techniques to bypass Google-imposed verifications, concrete strategies to avoid detection, and even strategies that leverage fraud detection to enhance fraud efficacy. We report a Google site vulnerability that enabled us to infer the mobile device models used to post more than 198 million reviews in Google Play, including 9,942 fake reviews. We discuss the deeper implications of our findings, including their potential use to develop the next generation fraud detection and prevention systems.more » « less
-
Fraud detection is of great importance because fraudulent behaviors may mislead consumers or bring huge losses to enterprises. Due to the lockstep feature of fraudulent behaviors, fraud detection problem can be viewed as finding suspicious dense blocks in the attributed bipartite graph. In reality, existing attribute-based methods are not adversarially robust, because fraudsters can take some camouflage actions to cover their behavior attributes as normal. More importantly, existing structural information based methods only consider shallow topology structure, making their effectiveness sensitive to the density of suspicious blocks. In this paper, we propose a novel deep structure learning model named DeepFD to differentiate normal users and suspicious users. DeepFD can preserve the non-linear graph structure and user behavior information simultaneously. Experimental results on different types of datasets demonstrate that DeepFD outperforms the state-of-the-art baselines.more » « less
An official website of the United States government

