"Better Be Computer or I'm Dumb": A Large-Scale Evaluation of Humans as Audio Deepfake Detectors

Warren, Kevin; Tucker, Tyler; Crowder, Anna; Olszewski, Daniel; Lu, Allison; Fedele, Caroline; Pasternak, Magdalena; Layton, Seth; Butler, Kevin; Gates, Carrie; Traynor, Patrick

doi:10.1145/3658644.3670325

Audio deepfakes represent a rising threat to trust in our daily communications. In response to this, the research community has developed a wide array of detection techniques aimed at preventing such attacks from deceiving users. Unfortunately, the creation of these defenses has generally overlooked the most important element of the system: the users themselves. As such, it is not clear whether current mechanisms augment, hinder, or simply contradict human classification of deepfakes. In this paper, we perform the first large-scale user study on deepfake detection. We recruit over 1,200 users and present them with samples from the three most widely cited deepfake datasets. We then quantitatively compare performance and qualitatively conduct thematic analysis to motivate and understand the reasoning behind user decisions and differences from machine classifications. Our results show that users correctly classify human audio at significantly higher rates than machine learning models, and rely on linguistic features and intuition when performing classification. However, users are also regularly misled by preconceptions about the capabilities of generated audio (e.g., that accents and background sounds are indicative of humans). Finally, machine learning models suffer from significantly higher false positive rates, and experience false negatives that humans correctly classify when issues of quality or robotic characteristics are reported. By analyzing user behavior across multiple deepfake datasets, our study demonstrates the need to more tightly compare user and machine learning performance, and to target the latter towards areas where humans are less likely to successfully identify threats.

Award ID(s): 1933208, 2206950, 2205171

PAR ID: 10616851

Author(s) / Creator(s): Warren, Kevin; Tucker, Tyler; Crowder, Anna; Olszewski, Daniel; Lu, Allison; Fedele, Caroline; Pasternak, Magdalena; Layton, Seth; Butler, Kevin; Gates, Carrie; Traynor, Patrick

Publisher / Repository: ACM

Date Published: 2024-12-02

ISBN: 9798400706363

Page Range / eLocation ID: 2696 to 2710

Format(s): Medium: X

Location: Salt Lake City UT USA

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1145/3658644.3670325
