
Title: Boosting Logical Fallacy Reasoning in LLMs via Logical Structure Tree
Award ID(s):
2127746
PAR ID:
10664460
Author(s) / Creator(s):
Publisher / Repository:
Association for Computational Linguistics
Date Published:
Page Range / eLocation ID:
13157 to 13173
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Reducing a failure-inducing input to a smaller one is challenging for inputs with internal dependencies because most sub-inputs are invalid. Kalhauge and Palsberg made progress on this problem by mapping the task to a reduction problem over dependency graphs that avoids invalid inputs entirely. Their tool J-Reduce efficiently reduces Java bytecode to 24 percent of its original size, which made it the most effective tool until now. However, its output is often too large to be helpful in a bug report. In this paper, we show that more fine-grained modeling of dependencies leads to much more reduction. Specifically, we use propositional logic to specify dependencies, and we show how this works for Java bytecode. Once we have a propositional formula that specifies all valid sub-inputs, we run an algorithm that finds a small, valid, failure-inducing input. Our algorithm interleaves runs of the buggy program with calls to a procedure that finds a minimal satisfying assignment. Our experiments show that we can reduce Java bytecode to 4.6 percent of its original size, which is 5.3 times better than the 24.3 percent achieved by J-Reduce. The much smaller output is more suitable for bug reports.
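The reduction loop described above — encoding dependencies as a propositional formula, then interleaving runs of the buggy program with searches for small satisfying assignments — can be illustrated at toy scale. This sketch uses brute-force enumeration in place of a real minimal-satisfying-assignment procedure, and all names are invented for the example; the paper's actual algorithm is far more efficient:

```python
from itertools import combinations

# Dependencies as CNF over items 1..n: literal +k means "item k is kept",
# -k means "item k is dropped". Example clause [-3, 1] encodes
# "keeping item 3 requires keeping item 1".
def holds(cnf, kept):
    return all(any((lit > 0) == (abs(lit) in kept) for lit in clause)
               for clause in cnf)

def valid_subsets_smallest_first(n, cnf):
    # Brute-force enumeration of dependency-respecting sub-inputs,
    # smallest first (a stand-in for a minimal-model SAT query).
    items = list(range(1, n + 1))
    for size in range(n + 1):
        for kept in combinations(items, size):
            if holds(cnf, set(kept)):
                yield set(kept)

def reduce_input(n, cnf, fails):
    # Interleave candidate generation with runs of the "buggy program"
    # (the fails predicate); return the first, hence smallest, valid
    # sub-input that still reproduces the failure.
    for kept in valid_subsets_smallest_first(n, cnf):
        if fails(kept):
            return kept
    return set(range(1, n + 1))  # nothing smaller reproduced the failure
```

On a toy input where item 3 triggers the bug but depends on item 1, `reduce_input(3, [[-3, 1]], lambda kept: 3 in kept)` returns `{1, 3}`: invalid sub-inputs such as `{3}` alone are never even tried.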
  2. Joint inquiry requires agents to exchange public content about some target domain, which in turn requires them to track which content a linguistic form contributes to a conversation. But, often, the inquiry delivers a necessary truth. For example, if we are inquiring whether a particular bird, Tweety, is a woodpecker, and discover that it is, then our inquiry concluding in this fact would conclude in a necessity, and the form “Tweety is a woodpecker” expresses this necessary truth. Still, whether Tweety is a woodpecker seems a perfectly legitimate object of study, and the answers we accrue can be informative. But the dominant model of inquiry (Stalnaker, 1978, 1984) treats this situation as linguistically deviant, and diagnoses our ignorance and subsequent discovery as metalinguistic: we were ignorant, and ultimately discovered something, about the meaning of our terms. Rather than linguistic deviation, we argue this situation is the norm, and one that calls for an alternative model of inquiry. This paper develops such a model. It shows that to capture how agents can learn something informative about the world—and not merely language—even when inquiry concerns necessary facts, it is key to track not only how moves in discourse contribute public content onto the conversational record but also, crucially, how those moves are connected by coherence relations to one another and to the real-world situations they are about. This allows us to capture that utterances contribute determinate, public content, while representing the information states of the interlocutors, who may have only partial access to the evidence and content of the conversation, without making their ignorance metalinguistic. It lets us give precise explanations of why some discourses can be transparently convincing in the conclusions they underwrite. The model thus precisifies the role of public context and shared content in anchoring an inquiry. It allows for imperfect tracking of linguistic contributions that are binding for how inquiry unfolds, and it allows an inquiry into the status of necessary truths to be both informative and to involve empirical, rather than metalinguistic, ignorance.
  3. Learning composable policies for environments with complex rules and tasks is a challenging problem. We introduce a hierarchical reinforcement learning framework called the Logical Options Framework (LOF) that learns policies that are satisfying, optimal, and composable. LOF efficiently learns policies that satisfy tasks by representing the task as an automaton and integrating it into learning and planning. We provide and prove conditions under which LOF will learn satisfying, optimal policies. Lastly, we show how LOF’s learned policies can be composed to satisfy unseen tasks with only 10-50 retraining steps on our benchmarks. We evaluate LOF on four tasks in discrete and continuous domains, including a 3D pick-and-place environment.
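The core idea above — representing a task as a finite automaton and planning over the product of environment states and automaton states — can be illustrated at toy scale. Everything here (the one-dimensional corridor, subgoal positions, reward values) is invented for the sketch; LOF itself learns reusable options and handles continuous domains:

```python
# Task automaton for "visit A, then B", composed with a 1-D corridor MDP,
# solved by value iteration over product states (position, automaton state).
N = 6                       # corridor positions 0..5
A_POS, B_POS = 2, 5         # subgoal locations (assumed for the example)

def auto_step(q, pos):
    # Automaton states: 0 = start, 1 = "A visited", 2 = accepting.
    if q == 0 and pos == A_POS:
        return 1
    if q == 1 and pos == B_POS:
        return 2
    return q

def q_value(V, p, q, a, gamma):
    np_ = min(max(p + a, 0), N - 1)          # move left or right, clipped
    nq = auto_step(q, np_)
    r = 1.0 if (nq == 2 and q != 2) else -0.01   # reward only on completion
    return r + gamma * V[(np_, nq)]

def plan(gamma=0.95, iters=100):
    # Value iteration over the product state space.
    V = {(p, q): 0.0 for p in range(N) for q in range(3)}
    for _ in range(iters):
        for p in range(N):
            for q in range(2):               # q == 2 is terminal, value 0
                V[(p, q)] = max(q_value(V, p, q, a, gamma) for a in (-1, 1))
    return V

def rollout(V, p=0, gamma=0.95, limit=20):
    # Greedy execution: the automaton state forces A before B.
    q, path = 0, [p]
    while q != 2 and len(path) < limit:
        a = max((-1, 1), key=lambda a: q_value(V, p, q, a, gamma))
        p = min(max(p + a, 0), N - 1)
        q = auto_step(q, p)
        path.append(p)
    return path, q
```

Running `rollout(plan())` from position 0 visits A before B and ends in the accepting automaton state; swapping in a different automaton re-targets the same environment dynamics to a new task, which is the compositionality the abstract describes.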
  4. We present calf, a cost-aware logical framework for studying quantitative aspects of functional programs. Taking inspiration from recent work that reconstructs traditional aspects of programming languages in terms of a modal account of phase distinctions, we argue that the cost structure of programs motivates a phase distinction between intension and extension. Armed with this technology, we contribute a synthetic account of cost structure as a computational effect in which cost-aware programs enjoy an internal noninterference property: input/output behavior cannot depend on cost. As a full-spectrum dependent type theory, calf presents a unified language for programming and specification of both cost and behavior that can be integrated smoothly with existing mathematical libraries available in type theoretic proof assistants. We evaluate calf as a general framework for cost analysis by implementing two fundamental techniques for algorithm analysis: the method of recurrence relations and the physicist’s method for amortized analysis. We deploy these techniques on a variety of case studies: we prove a tight, closed bound for Euclid’s algorithm, verify the amortized complexity of batched queues, and derive tight, closed bounds for the sequential and parallel complexity of merge sort, all fully mechanized in the Agda proof assistant. Lastly, we substantiate the soundness of quantitative reasoning in calf by means of a model construction.
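calf carries out its cost analyses inside dependent type theory, mechanized in Agda. A loose executable analogue of its "cost as an effect alongside behavior" idea is to instrument a program with an explicit cost counter and check the measured cost against a closed bound — here a Python sketch (not calf) for merge sort's comparison count:

```python
import math

def merge_sort(xs):
    # Return (sorted list, number of comparisons): the cost is threaded
    # through as an explicit result alongside the behavioral one.
    if len(xs) <= 1:
        return list(xs), 0
    mid = len(xs) // 2
    left, cl = merge_sort(xs[:mid])
    right, cr = merge_sort(xs[mid:])
    merged, cm, i, j = [], 0, 0, 0
    while i < len(left) and j < len(right):
        cm += 1                          # one comparison per loop iteration
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:]); merged.extend(right[j:])
    return merged, cl + cr + cm

# Measured comparisons never exceed the classic n * ceil(log2 n) bound.
for n in (2, 7, 16, 33):
    out, cost = merge_sort(list(range(n))[::-1])
    assert out == sorted(out)
    assert cost <= n * math.ceil(math.log2(n))
```

Where this sketch only tests the bound on sample inputs, calf proves it once and for all: the recurrence for the cost component is stated and solved inside the type theory, so the bound holds for every input.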