Effective fraud prevention and participant validation are essential for ensuring data quality in today's highly digitized research landscape. Increasingly sophisticated bots and growing numbers of fraudulent participants have created a need for more complex and nuanced methods of combating fraudulent activity. In this paper, we share our experiences with fraudulent survey responses, which we encountered in our work on abortion storytelling, and the multi-stage protocol we developed to validate participants. We found that effective fraud prevention should start early and combine a variety of flagging methods to encourage holistic pattern-searching in the data. Researchers should overestimate the amount of time they will need to validate participants and consider asking participants to assist in the validation process. We also encourage researchers to be transparent about the interpretive nature of this work. To this end, we contribute a Participant Validation Guide in the supplemental materials for community members to adapt in their own practices.
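A minimal sketch of the kind of multi-flag validation pass such a protocol might include. The flag names, thresholds, and response fields below are illustrative assumptions, not the authors' actual criteria:

```python
# Hypothetical multi-flag screening pass for survey responses.
# Flag names, thresholds, and fields are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Response:
    respondent_id: str
    ip: str
    duration_s: float
    open_ended: str
    flags: list = field(default_factory=list)

def flag_responses(responses, min_duration_s=120.0):
    # Group responses by IP so repeated addresses can be flagged.
    seen_ips = {}
    for r in responses:
        seen_ips.setdefault(r.ip, []).append(r)
    for r in responses:
        if len(seen_ips[r.ip]) > 1:
            r.flags.append("duplicate_ip")
        if r.duration_s < min_duration_s:
            r.flags.append("too_fast")
        if len(r.open_ended.split()) < 5:
            r.flags.append("thin_open_ended")
    # Flags are cues for holistic human review, not an automatic verdict.
    return [r for r in responses if len(r.flags) >= 2]
```

Consistent with the paper's emphasis on interpretive judgment, the flags here accumulate as cues for manual review rather than triggering automatic rejection.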
Beyond Bot Detection: Combating Fraudulent Online Survey Takers
Various techniques have been recommended for detecting fraudulent responses in online surveys, but little research has systematically tested the extent to which they actually work in practice. In this paper, we conduct an empirical evaluation of 22 anti-fraud tests in two complementary online surveys. The first survey recruits Rust programmers on public online forums and social media networks. We find that fraudulent respondents exhibit both bot and human characteristics. Among the anti-fraud tests, those designed around domain knowledge are the most effective. By combining individual tests, we achieve detection performance as good as that of commercial techniques while keeping the results more explainable. To explore these tests in a broader context, we ran a second survey on Amazon Mechanical Turk (MTurk). The results show that for a generic survey that does not require users to have any domain knowledge, it is more difficult to distinguish fraudulent responses; however, a subset of the tests remains effective.
- PAR ID: 10321051
- Date Published:
- Journal Name: Proceedings of the Web Conference 2022
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Privacy and security researchers often rely on data collected through online crowdsourcing platforms such as Amazon Mechanical Turk (MTurk) and Prolific. Prior work, which used data collected in the United States between 2013 and 2017, found that MTurk responses regarding security and privacy were generally representative for people under 50 or with some college education. However, the landscape of online crowdsourcing has changed significantly over the last five years, with the rise of Prolific as a major platform and the increasing presence of bots. This work attempts to replicate the prior results about the external validity of online privacy and security surveys. We conduct an online survey on MTurk (n=800), a gender-balanced survey on Prolific (n=800), and a representative survey on Prolific (n=800), and compare the responses to a probabilistic survey conducted by the Pew Research Center (n=4272). We find that MTurk response quality has degraded over the last five years, and our results do not replicate the earlier finding about the generalizability of MTurk responses. By contrast, we find that data collected through Prolific is generally representative for questions about user perceptions and experiences, but not for questions about security and privacy knowledge. We also evaluate the impact of Prolific settings, attention-check questions, and statistical methods on the external validity of online surveys, and we develop recommendations about best practices for conducting online privacy and security surveys.
The landscapes of many elementary, middle, and high school math classrooms have undergone major transformations over the last half-century, moving from drill-and-skill work to more conceptual reasoning and hands-on manipulative work. If you look at a college-level calculus class, however, you are likely to find that the main difference is that the professor now holds a whiteboard marker rather than a piece of chalk. Some student work may be done on the computer, but much of it consists of the same type of repetitive skill-building problems. This should seem strange given advancements in technology that allow more freedom than ever to build connections between different representations of a concept. Several class activities have been developed using a combination of approaches, depending on the topic. Topics covered in the activities include Riemann Sums, Accumulation, Center of Mass, Volumes of Revolution (Discs, Washers, and Shells), and Volumes of Similar Cross-section. All activities use student note outlines that are completed either in a whole-group interactive-lecture approach or in a group-work inquiry-based approach. Some of the activities use interactive graphs designed on desmos.com, and others use physical models designed in OpenSCAD and 3D-printed for students to use in class. Tactile objects were developed because they should give students an advantage by letting them physically interact with the concepts being taught, deepening their involvement with the material, and providing more stimuli for the brain to encode the learning experience. Web-based activities were developed for topics that required substantial changes in graphical representations (e.g., limits with Riemann Sums). Assessment techniques for each topic include online homework, exams, and online concept questions with an explanation response area. These concept questions are intended to measure students' ability to use multiple representations to answer the question, and are not generally computational in nature. Students are also given surveys to rate the overall activities, as well as finer-grained survey questions to elicit student thoughts on certain aspects of the models, websites, and activity sheets. We will report on student responses to the activity surveys, looking for common themes in students' thoughts toward specific attributes of the activities. We will also compare relevant exam-question responses and online concept-question results, including common themes present or absent in student reasoning.
Outreach and communication with the public have substantial value in polar research, where studies often find changes of global importance that are happening far out of sight of the majority of people living at lower latitudes. Seeking evidence on the effectiveness of outreach programs, the U.S. National Science Foundation sponsored large-scale survey assessments before and after the International Polar Year in 2007/2008. Polar-knowledge questions have subsequently been tested and refined through other nationwide and regional surveys. More than a decade of such work has established that basic but fairly specific knowledge questions, with all answer choices sounding plausible but only one being correct, can yield highly replicable results. Those results, however, paint a mixed picture of knowledge. Some factual questions seem to be interpreted by many respondents as if they had been asked for their personal beliefs about climate change, so their responses reflect sociopolitical identity rather than physical-world knowledge. Other factual questions, by design, do not link in obvious ways to climate-change beliefs, so responses have simpler interpretations in terms of knowledge gaps and education needs.
With the rapid growth of online learning at community colleges and the low course completion and performance associated with it, there is an increasing need to identify effective ways to address the challenges of online teaching and learning in this setting. Based on open-ended survey responses from 105 instructors and 365 students at multiple community colleges in one state, this study examined instructors' and students' perceptions of effective and ineffective instructional practices and of changes needed in online coursework. By combining structural topic modelling techniques with human coding, we identified instructional practices that both instructors and students perceived as effective in supporting online learning, as well as practices seen as ineffective and needing improvement. Moreover, we identified a handful of misalignments between instructors' and students' perceptions of online teaching, including around course workload and effective ways to communicate.
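As a much-simplified stand-in for the theme-surfacing step such a study describes (the study itself combined structural topic modelling with human coding), the sketch below extracts the most frequent content words per respondent group, which a human coder could then label and compare across groups. All sample responses and word lists are invented:

```python
# Crude keyword-based theme surfacing for open-ended survey responses.
# A stand-in only: the study used structural topic modelling plus human
# coding; sample responses and the stopword list are invented.
from collections import Counter
import re

STOPWORDS = {"the", "a", "an", "and", "to", "of", "is", "in", "it",
             "i", "my", "was", "too", "for", "with", "that"}

def top_terms(responses, k=3):
    # Count non-stopword tokens across all responses in a group.
    counts = Counter()
    for text in responses:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return [w for w, _ in counts.most_common(k)]

instructor_views = [
    "workload felt reasonable and announcements kept students on track",
    "weekly announcements and clear deadlines helped the workload",
]
student_views = [
    "the workload was too heavy and feedback came too slowly",
    "heavy workload and slow feedback made the course stressful",
]
```

Comparing the top terms of the two groups is one cheap way to spot candidate misalignments (e.g., both groups mention workload, but only students surface feedback) before committing to full topic-model coding.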