Title: Combating misinformation in the age of LLMs: Opportunities and challenges
Abstract Misinformation such as fake news and rumors is a serious threat to information ecosystems and public trust. The emergence of large language models (LLMs) has great potential to reshape the landscape of combating misinformation. Generally, LLMs can be a double-edged sword in this fight. On the one hand, LLMs bring promising opportunities for combating misinformation thanks to their profound world knowledge and strong reasoning abilities, raising one emerging question: can we utilize LLMs to combat misinformation? On the other hand, the critical challenge is that LLMs can easily be leveraged to generate deceptive misinformation at scale, raising another important question: how do we combat LLM-generated misinformation? In this paper, we first systematically review the history of combating misinformation before the advent of LLMs. We then illustrate current efforts and present an outlook for each of these two fundamental questions. The goal of this survey is to facilitate progress in utilizing LLMs to fight misinformation and to call for interdisciplinary efforts from different stakeholders to combat LLM-generated misinformation.
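As a minimal illustration of the first question (utilizing LLMs to combat misinformation), a detection pipeline might prompt an LLM to assess a claim's veracity. The sketch below is not a pipeline prescribed by the survey; the ask_llm callable is a hypothetical stand-in for any chat-style model client, and the verdict labels are illustrative.

from typing import Callable

def check_claim(claim: str, ask_llm: Callable[[str], str]) -> str:
    """Ask an LLM to assess a claim and return its verdict text."""
    prompt = (
        "You are a fact-checking assistant. Assess the following claim.\n"
        f"Claim: {claim}\n"
        "Answer with one of SUPPORTED, REFUTED, or NOT ENOUGH INFO, "
        "followed by a one-sentence justification."
    )
    return ask_llm(prompt)

# Usage with a stub; swap the lambda for a real chat-model client call.
print(check_claim("Drinking hot water cures influenza.",
                  lambda p: "NOT ENOUGH INFO - no evidence was provided."))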
Award ID(s):
2241068
PAR ID:
10544410
Author(s) / Creator(s):
 ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
AI Magazine
Volume:
45
Issue:
3
ISSN:
0738-4602
Format(s):
Medium: X
Size(s):
p. 354-368
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Well-studied techniques that enhance diversity in early design concept generation require effective metrics for evaluating human-perceived similarity between ideas. Recent work suggests collecting triplet comparisons between designs directly from human raters and using those triplets to form an embedding where similarity is expressed as a Euclidean distance. While effective at modeling human-perceived similarity judgments, these methods are expensive and require a large number of triplets to be hand-labeled. However, what if there were a way to use AI to replicate the human similarity judgments captured in triplet embedding methods? In this paper, we explore the potential for pretrained Large Language Models (LLMs) to be used in this context. Using a dataset of crowdsourced text descriptions written about engineering design sketches, we generate LLM embeddings and compare them to an embedding created from human-provided triplets of those same sketches. From these embeddings, we can use Euclidean distances to describe areas where human perception and LLM perception disagree regarding design similarity. We then implement this same procedure but with descriptions written from a template that attempts to isolate a particular modality of a design (i.e., functions, behaviors, structures). By comparing the templated description embeddings to both the triplet-generated and pre-template LLM embeddings, we identify ways of describing designs such that LLM and human similarity perception better agree. We use these results to better understand how humans and LLMs interpret similarity in engineering designs. 
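A rough sketch of the comparison described above, assuming a sentence-transformers model as the source of the LLM embeddings and a precomputed human triplet-derived embedding; the function name and the Spearman rank comparison are illustrative choices, not the paper's exact procedure.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer  # assumed embedding backend

def distance_agreement(descriptions, human_embedding):
    """Compare pairwise Euclidean distances from LLM text embeddings against
    distances from a human triplet-derived embedding (rows aligned to descriptions)."""
    model = SentenceTransformer("all-MiniLM-L6-v2")      # any text-embedding model would do
    llm_embedding = model.encode(descriptions)           # shape (n, d_llm)
    llm_dists = pdist(llm_embedding, metric="euclidean")
    human_dists = pdist(np.asarray(human_embedding), metric="euclidean")
    rho, _ = spearmanr(llm_dists, human_dists)           # rank agreement of the two distance sets
    return rho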
  2. Text watermarks for large language models (LLMs) have been commonly used to identify the origins of machine-generated content, which is promising for assessing liability when combating deepfake or harmful content. While existing watermarking techniques typically prioritize robustness against removal attacks, unfortunately, they are vulnerable to spoofing attacks: malicious actors can subtly alter the meanings of LLM-generated responses or even forge harmful content, potentially misattributing blame to the LLM developer. To overcome this, we introduce a bi-level signature scheme, Bileve, which embeds fine-grained signature bits for integrity checks (mitigating spoofing attacks) as well as a coarse-grained signal to trace text sources when the signature is invalid (enhancing detectability) via a novel rank-based sampling strategy. Compared to conventional watermark detectors that only output binary results, Bileve can differentiate 5 scenarios during detection, reliably tracing text provenance and regulating LLMs. The experiments conducted on OPT-1.3B and LLaMA-7B demonstrate the effectiveness of Bileve in defeating spoofing attacks with enhanced detectability. 
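The bi-level idea can be caricatured as a two-stage detector: verify an embedded fine-grained signature first, and fall back to a coarse watermark signal when the signature does not check out. The sketch below is a conceptual outline only; verify_signature and coarse_signal are hypothetical placeholders, and the actual Bileve scheme relies on a rank-based sampling strategy and distinguishes five detection scenarios that are not reproduced here.

def classify_text(text, public_key, verify_signature, coarse_signal, threshold=4.0):
    """Two-stage (bi-level) provenance check; the callables are placeholders."""
    if verify_signature(text, public_key):      # fine-grained integrity check
        return "intact: generated by the keyed LLM and unmodified"
    if coarse_signal(text) >= threshold:        # coarse statistical watermark signal
        return "traced: watermark present but signature invalid (possibly edited or spoofed)"
    return "untraced: no reliable watermark evidence"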
  3. Large Language Models (LLMs) excel at various tasks, including problem-solving and question-answering. However, LLMs often find Math Word Problems (MWPs) challenging because solving them requires a range of reasoning and mathematical abilities with which LLMs seem to struggle. Recent efforts have helped LLMs solve more complex MWPs with improved prompts. This study proposes a novel method that initially prompts an LLM to create equations from a decomposition of the question, followed by using an external symbolic equation solver to produce an answer. To ensure the accuracy of the obtained answer, inspired by an established recommendation of math teachers, the LLM is instructed to solve the MWP a second time, but this time with the objective of estimating the correct answer instead of solving it exactly. The estimate is then compared to the generated answer for verification. If verification fails, an iterative rectification process is employed to ensure the correct answer is eventually found. This approach achieves new state-of-the-art results on datasets used by prior published research on numeric and algebraic MWPs, improving the previous best results by nearly two percent on average. In addition, the approach obtains satisfactory results on trigonometric MWPs, a task not previously attempted to the authors' best knowledge. This study also introduces two new datasets, SVAMPClean and Trig300, to further advance the testing of LLMs' reasoning abilities.
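A compact sketch of the solve-then-estimate loop described above, using sympy as the external symbolic solver. The prompts, the single-unknown equation format, and the tolerance are placeholders for illustration, not the authors' actual implementation.

import sympy as sp
from typing import Callable

def solve_equation(equation: str, unknown: str = "x") -> float:
    """Solve a single LLM-produced equation such as '2*x + 3 = 11' exactly."""
    x = sp.Symbol(unknown)
    lhs, rhs = equation.split("=")
    return float(sp.solve(sp.Eq(sp.sympify(lhs), sp.sympify(rhs)), x)[0])

def solve_with_verification(problem: str, ask_llm: Callable[[str], str],
                            max_rounds: int = 3, tol: float = 0.05) -> float:
    """Solve exactly, ask the LLM for a rough estimate, and retry on disagreement."""
    for _ in range(max_rounds):
        eq = ask_llm("Decompose the problem and return one equation in x, "
                     "written as 'lhs = rhs': " + problem)
        answer = solve_equation(eq)
        estimate = float(ask_llm("Without solving exactly, estimate the numeric "
                                 "answer (return only a number): " + problem))
        if abs(answer - estimate) <= tol * max(abs(estimate), 1.0):
            return answer                      # exact answer and estimate agree
        problem += " (The previous equation was rejected; re-derive it.)"
    return answer                              # best effort after max_rounds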
  4. Abstract This paper presents a new approach to improve static program analysis using Large Language Models (LLMs). The approach interleaves calls to the static analyzer and queries to the LLM. The query to the LLM is constructed based on intermediate results from the static analysis, and subsequent static analysis uses the results from the LLM query. We apply our approach to the problem of error-specification inference: given systems code written in C, infer the set of values that each function can return on error. Such error specifications aid in program understanding and can be used to find error-handling bugs. We implemented our approach by incorporating LLMs into EESI, the state-of-the-art static analysis for error-specification inference. Compared to EESI, our approach achieves higher recall (from an average of 52.55% to 77.83%) and higher F1-score (from an average of 0.612 to 0.804) while maintaining precision (from an average of 86.67% to 85.12%) on real-world benchmarks such as Apache HTTPD and MbedTLS. We also conducted experiments to understand the sources of imprecision in our LLM-assisted analysis as well as the impact of LLM nondeterminism on the analysis results.
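The interleaving described above might look roughly like the fixpoint loop below. run_static_analysis, build_llm_query, and parse_error_values are hypothetical stand-ins, not EESI's actual interfaces; the loop simply alternates analysis passes with LLM queries until the inferred specifications stop changing.

from typing import Callable, Dict, Set

def infer_error_specs(functions, run_static_analysis, build_llm_query,
                      parse_error_values, ask_llm: Callable[[str], str],
                      max_iters: int = 5) -> Dict[str, Set[int]]:
    """Interleave static analysis and LLM queries until the inferred
    error specifications stop changing (a simple fixpoint loop)."""
    specs: Dict[str, Set[int]] = {}
    for _ in range(max_iters):
        facts = run_static_analysis(functions, specs)    # propagate current specs
        new_specs = dict(specs)
        for fn in facts.unresolved:                      # functions the analysis could not resolve
            reply = ask_llm(build_llm_query(fn, facts))  # query built from intermediate results
            new_specs[fn] = parse_error_values(reply)    # e.g., {-1, 0}
        if new_specs == specs:                           # fixpoint: nothing new learned
            break
        specs = new_specs
    return specs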
  5. Abstract Background: Generative artificial intelligence (AI) large language models (LLMs) have significant potential as research tools. However, the broader implications of using these tools are still emerging. Few studies have explored using LLMs to generate data for qualitative engineering education research. Purpose/Hypothesis: We explore the following questions: (i) What are the affordances and limitations of using LLMs to generate qualitative data in engineering education, and (ii) in what ways might these data reproduce and reinforce dominant cultural narratives in engineering education, including narratives of high stress? Design/Methods: We analyzed similarities and differences between LLM-generated conversational data (ChatGPT) and qualitative interviews with engineering faculty and undergraduate engineering students from multiple institutions. We identified patterns, affordances, limitations, and underlying biases in the generated data. Results: LLM-generated content contained responses similar to interview content. Varying the prompt persona (e.g., demographic information) increased response variety. When prompted for ways to decrease stress in engineering education, LLM responses more readily described opportunities for structural change, while participants' responses more often described personal changes. LLM data more frequently stereotyped a response than participants did, meaning that LLM responses lacked the nuance and variation that naturally occur in interviews. Conclusions: LLMs may be a useful tool for brainstorming, for example during protocol development and refinement. However, the bias present in the data indicates that care must be taken when engaging with LLMs to generate data. Specially trained LLMs based only on data from engineering education hold promise for future research.
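A minimal sketch of the persona-varied prompting described above; the persona fields and prompt wording are illustrative only, and ask_llm stands in for any chat-model client rather than the study's actual protocol.

from typing import Callable, Dict, List

def generate_responses(question: str, personas: List[Dict[str, str]],
                       ask_llm: Callable[[str], str]) -> List[str]:
    """Generate one synthetic interview-style answer per persona."""
    answers = []
    for p in personas:
        prompt = (f"Answer as a {p['role']} ({p['background']}) being interviewed "
                  f"about stress in engineering education.\nQuestion: {question}")
        answers.append(ask_llm(prompt))
    return answers

# Example personas (hypothetical); varying them is what diversifies the generated data.
personas = [
    {"role": "undergraduate engineering student", "background": "first-generation college student"},
    {"role": "engineering faculty member", "background": "15 years of teaching experience"},
]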