A deepfake is content that is synthetically generated or manipulated using artificial intelligence (AI) methods so that it can be passed off as real; it can involve audio, video, image, or text synthesis. The key difference between manual editing and deepfakes is that deepfakes are AI-generated or AI-manipulated and closely resemble authentic artifacts; in some cases, a deepfake is fabricated entirely from AI-generated content. Deepfakes have begun to have a major impact on society, with new generation mechanisms emerging every day. This article contributes to understanding the landscape of deepfakes and their generation and detection methods. We evaluate the various categories of deepfakes, with a particular emphasis on audio. The purpose of this survey is to give readers a deeper understanding of (1) the different deepfake categories; (2) how they can be created and detected; and (3) in greater detail, how audio deepfakes are created and detected, which is the main focus of this paper. We found that generative adversarial networks (GANs), convolutional neural networks (CNNs), and deep neural networks (DNNs) are common means of both creating and detecting deepfakes. In our evaluation of over 150 methods, we found that most of the attention falls on video deepfakes, and in particular on the generation of video deepfakes. For text deepfakes, there are many generation methods but very few robust detection methods, including for fake news, which has become a controversial area of research because of the potentially heavy overlap with human-generated fake content. Our study reveals a clear need for research on audio deepfakes, and particularly on their detection. Unlike existing surveys, which mostly focus on video and image deepfakes, this survey concentrates on audio deepfakes, which most existing surveys overlook. The article's most important contribution is to critically analyze and provide a unique source of audio deepfake research, mostly ranging from 2016 to 2021. To the best of our knowledge, this is the first survey in English focusing on audio deepfake generation and detection.
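To make the detection side of the survey concrete, the following is a minimal, hypothetical sketch of the kind of CNN-based audio deepfake detector described above: a small convolutional classifier over log-mel-spectrogram features. The model, layer sizes, and torchaudio front end are illustrative assumptions, not a method taken from any surveyed paper.

```python
# Minimal sketch (assumption, not a surveyed method): a CNN that
# classifies mel-spectrograms of speech as bona fide vs. deepfake.
import torch
import torch.nn as nn
import torchaudio

class SpectrogramCNN(nn.Module):
    def __init__(self, n_mels: int = 64):
        super().__init__()
        # Log-mel front end: waveform -> (batch, 1, n_mels, frames).
        self.mel = torchaudio.transforms.MelSpectrogram(
            sample_rate=16_000, n_mels=n_mels)
        self.db = torchaudio.transforms.AmplitudeToDB()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),        # global pooling -> (batch, 32, 1, 1)
        )
        self.classifier = nn.Linear(32, 2)  # logits: [bona fide, deepfake]

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        spec = self.db(self.mel(waveform)).unsqueeze(1)
        return self.classifier(self.features(spec).flatten(1))

# Usage: one second of 16 kHz audio per item in the batch.
logits = SpectrogramCNN()(torch.randn(4, 16_000))
print(logits.shape)  # torch.Size([4, 2])
```

Real detectors in the literature vary widely in front end and architecture; this sketch only shows the common spectrogram-plus-CNN pattern that many of the surveyed detection methods share.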
Dungeons & Deepfakes: Using scenario-based role-play to study journalists' behavior towards using AI-based verification tools for video content
The evolving landscape of manipulated media, including the threat of deepfakes, has made information verification a daunting challenge for journalists. Technologists have developed tools to detect deepfakes, but these tools can sometimes yield inaccurate results, raising concerns about inadvertently disseminating manipulated content as authentic news. This study examines the impact of unreliable deepfake detection tools on information verification. We conducted role-playing exercises with 24 US journalists, immersing them in complex breaking-news scenarios where determining authenticity was challenging. Through these exercises, we explored questions regarding journalists’ investigative processes, use of a deepfake detection tool, and decisions on when and what to publish. Our findings reveal that journalists are diligent in verifying information, but sometimes rely too heavily on results from deepfake detection tools. We argue for more cautious release of such tools, accompanied by proper training for users to mitigate the risk of unintentionally propagating manipulated content as real news.
- Award ID(s): 2040209
- PAR ID: 10543750
- Publisher / Repository: ACM
- Date Published:
- ISBN: 9798400703300
- Page Range / eLocation ID: 1 to 17
- Format(s): Medium: X
- Location: Honolulu HI USA
- Sponsoring Org: National Science Foundation
More Like this
- Easy access to audio-visual content on social media, combined with the availability of modern tools such as TensorFlow and Keras, open-source pre-trained models, economical computing infrastructure, and the rapid evolution of deep-learning (DL) methods, has heralded a new and frightening trend. In particular, the advent of easily available, ready-to-use generative adversarial networks (GANs) has made it possible to generate deepfake media that are partially or completely fabricated with the intent to deceive: to disseminate disinformation and revenge porn, to perpetrate financial fraud and other hoaxes, and to disrupt government functioning (a generic sketch of the adversarial setup behind GANs appears after this list). Existing surveys have mainly focused on the detection of deepfake images and videos; this paper provides a comprehensive review and detailed analysis of existing tools and machine learning (ML) based approaches for deepfake generation, and of the methodologies used to detect such manipulations in both audio and video. For each category of deepfake, we discuss the manipulation approaches, current public datasets, and key standards for evaluating the performance of deepfake detection techniques, along with their results. Additionally, we discuss open challenges and enumerate future directions to guide researchers on issues that need to be considered to improve both deepfake generation and detection. This work is expected to assist readers in understanding how deepfakes are created and detected, their current limitations, and where future research may lead.
- By employing generative deep-learning techniques, deepfakes are created with the intent to sow mistrust in society, manipulate public opinion and political decisions, and serve other malicious purposes such as blackmail, scamming, and even cyberstalking. Because a realistic deepfake may involve manipulation of audio, video, or both, it is important to explore the possibility of detecting deepfakes through the failure of generative algorithms to synchronize the audio and visual modalities. Prevailing high-performing methods detect either audio or video cues, while a few ensemble the predictions from both modalities without inspecting the relationship between the audio and video cues; deepfake detection using jointly learned audiovisual representations has not been explored much. This paper therefore proposes a unified multimodal framework, Multimodaltrace, which extracts learned channels from the audio and visual modalities, mixes them independently in an IntrAmodality Mixer Layer (IAML), processes them jointly in IntErModality Mixer Layers (IEML), and feeds the result to a multilabel classification head (one plausible reading of this design is sketched after this list). Empirical results show the effectiveness of the proposed framework, which gives state-of-the-art accuracy of 92.9% on the FakeAVCeleb dataset. Cross-dataset evaluation of the proposed framework on the World Leaders and Presidential Deepfake Detection datasets gives accuracies of 83.61% and 70%, respectively. The study also provides insight into how the model focuses on different parts of the audio and visual features through integrated-gradient analysis.
- Deepfakes have become a dual-use technology with applications in art, science, and industry. However, the technology can also be leveraged maliciously in areas such as disinformation, identity fraud, and harassment. In response to the technology's dangerous potential, many deepfake creation communities have been deplatformed, including the technology's originating community, r/deepfakes. Opening in February 2018, just eight days after the removal of r/deepfakes, MrDeepFakes (MDF) went online as a privately owned platform to fill the role of community hub, and has since grown into the largest dedicated deepfake creation and discussion platform currently online. This position as community hub is balanced against the site's other main purpose: hosting deepfake pornography depicting public figures, produced without consent. In this paper we explore the two largest deepfake communities that have existed, via a mixed-methods approach combining quantitative and qualitative analysis. We seek to identify how these platforms were and are used by their members, what opinions these deepfakers hold about the technology and how it is seen by society at large, and how deepfakes-as-disinformation is viewed by the community. We find a strong emphasis on technical discussion on these platforms, intermixed with potentially malicious content. Additionally, we find that the deplatforming of deepfake communities early in the technology's life has significantly impacted trust in alternative community platforms.
- The limited information (data voids) on political topics relevant to underrepresented communities has facilitated the spread of disinformation. Independent journalists who combat disinformation in underrepresented communities have reported feeling overwhelmed because they lack the tools needed to make sense of the information they monitor and to address the data voids. In this paper, we present a system to identify and address political data voids within underrepresented communities. Guided by an interview study indicating that independent news media have the potential to address these data voids, we designed an intelligent system, Datavoidant. Datavoidant introduces a novel design space focused on providing independent journalists with a collective understanding of data voids in order to facilitate generating content that covers them. We performed a user-interface evaluation with independent news media journalists (N=22). Journalists reported that Datavoidant's features allowed them to develop a sense of what was taking place in the information ecosystem more rapidly and easily so as to address the data voids. They also reported feeling more confident about the content they created and the unique perspectives they proposed to cover the voids. We conclude by discussing how Datavoidant enables a new design space in which individuals can collaboratively make sense of their information ecosystem, proactively devise strategies for contributing unique information to it, and together prevent disinformation.
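As a companion to the first item above, which attributes much of deepfake generation to GANs, here is a minimal, generic GAN training step. It is a textbook sketch on toy flat vectors, with assumed sizes (LATENT, DATA) and architectures; it is not the pipeline of any tool reviewed in that paper.

```python
# Generic GAN sketch (illustrative only): the adversarial game that
# underlies GAN-based deepfake generation, on toy flat vectors.
import torch
import torch.nn as nn

LATENT, DATA = 64, 256  # assumed sizes for the toy example

G = nn.Sequential(nn.Linear(LATENT, 128), nn.ReLU(), nn.Linear(128, DATA))
D = nn.Sequential(nn.Linear(DATA, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real: torch.Tensor) -> None:
    b = real.size(0)
    # 1) Discriminator step: real samples labeled 1, generated samples 0.
    fake = G(torch.randn(b, LATENT)).detach()
    loss_d = bce(D(real), torch.ones(b, 1)) + bce(D(fake), torch.zeros(b, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # 2) Generator step: fool the discriminator into predicting 1 on fakes.
    fake = G(torch.randn(b, LATENT))
    loss_g = bce(D(fake), torch.ones(b, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

train_step(torch.randn(32, DATA))  # one step on a dummy "real" batch
```

Production deepfake generators replace the toy linear networks with convolutional or autoregressive architectures over images, video frames, or audio, but the adversarial objective is the same.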
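The Multimodaltrace abstract above describes intra-modality mixing followed by inter-modality mixing and a multilabel head. The sketch below is one plausible reading of that design, with invented layer sizes, MLP mixers, and fusion details; the paper's actual IAML/IEML internals are not reproduced here.

```python
# Hypothetical reading of an IAML/IEML-style audiovisual detector:
# mix each modality's features independently, then mix them jointly.
import torch
import torch.nn as nn

def mlp(dim: int) -> nn.Sequential:
    return nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

class AudioVisualMixer(nn.Module):
    def __init__(self, dim: int = 128, n_labels: int = 2):
        super().__init__()
        self.iaml_audio = mlp(dim)   # intra-modality mixing, audio stream
        self.iaml_video = mlp(dim)   # intra-modality mixing, video stream
        self.ieml = mlp(2 * dim)     # inter-modality mixing on the fused vector
        self.head = nn.Linear(2 * dim, n_labels)  # multilabel logits

    def forward(self, audio: torch.Tensor, video: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.iaml_audio(audio), self.iaml_video(video)], dim=-1)
        return self.head(self.ieml(fused))  # e.g. train with BCEWithLogitsLoss

# Usage with pre-extracted 128-dim audio and video embeddings.
logits = AudioVisualMixer()(torch.randn(8, 128), torch.randn(8, 128))
print(logits.shape)  # torch.Size([8, 2])
```

The design choice the abstract emphasizes is that the two streams interact only after each has been mixed on its own, which lets the joint layers model audio-video (a)synchrony cues that single-modality detectors miss.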