skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation
Award ID(s):
2044660
PAR ID:
10615701
Author(s) / Creator(s):
; ; ; ; ; ; ; ;
Publisher / Repository:
Association for Computational Linguistics
Date Published:
Page Range / eLocation ID:
15134 to 15158
Format(s):
Medium: X
Location:
Miami, Florida, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Berg, Jeremias; Nordström, Jakob (Ed.)
    We present a lightweight reencoding technique that augments propositional formulas containing implicit or explicit exactly-one constraints with auxiliary variables derived from the order encoding. Our approach is based on the observation that many formulas contain clauses where each literal appears only in that clause, and that these unique literal clauses can be replaced by the corresponding sequential counter encoding of exactly-one constraints, which introduces the same variables as the order encoding. We implemented the reencoding in the state-of-the-art SAT solver CaDiCaL with support for proof logging and solution reconstruction. Experiments on SAT Competition benchmarks demonstrate that our technique enables solving dozens of additional formulas. We found that shuffling a formula before reencoding harms performance. To mitigate this issue, we introduce a method that sorts literals within clauses based on the formula structure before applying our reencoding. The same technique also predicts whether reencoding is likely to yield improvements. 
    more » « less
  2. The effectiveness of satisfiability solvers strongly depends on the quality of the encoding of a given problem into conjunctive normal form. Cardinality constraints are prevalent in numerous problems, prompting the development and study of various types of encoding. We present a novel approach to optimizing cardinality constraint encodings by exploring the impact of literal orderings within the constraints. By strategically placing related literals nearby each other, the encoding generates auxiliary variables in a hierarchical structure, enabling the solver to reason more abstractly about groups of related literals. Unlike conventional metrics such as formula size or propagation strength, our method leverages structural properties of the formula to redefine the roles of auxiliary variables to enhance the solver's learning capabilities. The experimental evaluation on benchmarks from the maximum satisfiability competition demonstrates that literal orderings can be more influential than the choice of the encoding type. Our literal ordering technique improves solver performance across various encoding techniques, underscoring the robustness of our approach. 
    more » « less
  3. Verifying political claims is a challenging task, as politicians can use various tactics to subtly misrepresent the facts for their agenda. Existing automatic fact-checking systems fall short here, and their predictions like "half-true" are not very useful in isolation, since it is unclear which parts of a claim are true or false. In this work, we focus on decomposing a complex claim into a comprehensive set of yes-no subquestions whose answers influence the veracity of the claim. We present CLAIMDECOMP, a dataset of decompositions for over 1000 claims. Given a claim and its verification paragraph written by fact-checkers, our trained annotators write subquestions covering both explicit propositions of the original claim and its implicit facets, such as additional political context that changes our view of the claim's veracity. We study whether state-of-the-art pre-trained models can learn to generate such subquestions. Our experiments show that these models generate reasonable questions, but predicting implied subquestions based only on the claim (without consulting other evidence) remains challenging. Nevertheless, we show that predicted subquestions can help identify relevant evidence to fact-check the full claim and derive the veracity through their answers, suggesting that claim decomposition can be a useful piece of a fact-checking pipeline. 
    more » « less
  4. The onus on the average person is greater than ever before to make sense of large amounts of readily accessible quantitative information, but the ability and confidence to do so are frequently lacking. Many people lack practical mathematical skills that are essential for evaluating risks, probabilities and numerical outcomes such as survival rates for medical treatments, income from retirement savings plans or monetary damages in civil trials. In this Review, we integrate research on objective and subjective numeracy, focusing on cognitive and metacognitive factors that distort human perceptions and foment systematic biases in judgement and decision making. Paradoxically, an important implication of this research is that a literal focus on objective numbers and mechanical number crunching is misguided. Numbers can be a matter of life and death but a person who uses rote strategies (verbatim representations) cannot take advantage of the information contained in the numbers because 'rote' strategies are, by definition, processing without meaning. Verbatim representations (verbatim is only surface form, not meaning) treat numbers as data as opposed to information. We highlight a contrasting approach of gist extraction: organizing numbers meaningfully, interpreting them qualitatively and making meaningful inferences about them. Efforts to improve numerical cognition and its practical applications can benefit from emphasizing the qualitative meaning of numbers in context - the gist - building on the strengths of humans as intuitive mathematicians. Thus, we conclude by reviewing evidence that gist training facilitates transfer to new contexts and, because it is more durable, longer-lasting improvements in decision making. 
    more » « less