Questions Under Discussion (QUD) is a versatile linguistic framework in which discourse progresses as continuously asking questions and answering them. Automatic parsing of a discourse to produce a QUD structure thus entails a complex question generation task: given a document and an answer sentence, generate a question that satisfies linguistic constraints of QUD and can be grounded in an anchor sentence in prior context. These questions are known to be curiosity-driven and open-ended. This work introduces the first framework for the automatic evaluation of QUD parsing, instantiating the theoretical constraints of QUD in a concrete protocol. We present QUDeval, a dataset of fine-grained evaluation of 2,190 QUD questions generated from both fine-tuned systems and LLMs. Using QUDeval, we show that satisfying all constraints of QUD is still challenging for modern LLMs, and that existing evaluation metrics poorly approximate parser quality. Encouragingly, human-authored QUDs are scored highly by our human evaluators, suggesting that there is headroom for further progress on language modeling to improve both QUD parsing and QUD evaluation.
more »
« less
This content will become publicly available on October 27, 2025
Handling Open-Vocabulary Constructs in Formalizing Speci- fications: Retrieval-Augmented Parsing with Expert Knowl- edge
We study the problem of Open-Vocabulary Constructs (OVCs)—ones not known beforehand—in the context of converting natural language (NL) specifications into formal languages (e.g., temporal logic or code). Mod- els fare poorly on OVCs due to a lack of necessary knowledge a priori. In such situations, a domain expert can provide correct constructs at in- ference time based on their preferences or domain knowledge. Our goal is to effectively reuse this inference-time, expert-provided knowledge for future parses without retraining the model. We present dynamic knowledge- augmented parsing (DKAP), where in addition to the input sentence, the model receives (dynamically growing) expert knowledge as a key-value lexicon that associates NL phrases with correct OVC constructs. We pro- pose ROLEX, a retrieval-augmented parsing approach that uses this lexicon. A retriever and a generator are trained to find and use the key-value store to produce the correct parse. A key challenge lies in curating data for this retrieval-augmented parser. We utilize synthetic data generation and the data augmentation techniques on annotated (NL sentence, FL statement) pairs to train the augmented parser. To improve training effectiveness, we propose multiple strategies to teach models to focus on the relevant subset of retrieved knowledge. Finally, we introduce a new evaluation paradigm modeled after the DKAP problem and simulate the scenario across three formalization tasks (NL2LTL, NL2Code, and NL2CMD). Our evaluations show that DKAP is a difficult challenge, and ROLEX helps improve the performance of baseline models by using dynamic expert knowledge effectively.
more »
« less
- Award ID(s):
- 1918225
- PAR ID:
- 10556861
- Publisher / Repository:
- https://openreview.net/forum?id=BgvgMxY8s5
- Date Published:
- Format(s):
- Medium: X
- Location:
- Philadelphia, PA
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Computing devices have recently become capable of interacting with their end users via natural language. However, they can only operate within a limited “supported” domain of discourse and fail drastically when faced with an out-of-domain utterance, mainly due to the limitations of their semantic parser. In this paper, we propose a semantic parser that generalizes to out-of-domain examples by learning a general strategy for parsing an unseen utterance through adapting the logical forms of seen utterances, instead of learning to generate a logical form from scratch. Our parser maintains a memory consisting of a representative subset of the seen utterances paired with their logical forms. Given an unseen utterance, our parser works by looking up a similar utterance from the memory and adapting its logical form until it fits the unseen utterance. Moreover, we present a data generation strategy for constructing utterance-logical form pairs from different domains. Our results show an improvement of up to 68.8% on one-shot parsing under two different evaluation settings compared to the baselines.more » « less
-
Shi, Huang, and Lee (2017) obtained state-of-the-art results for English and Chinese dependency parsing by combining dynamic-programming implementations of transition-based dependency parsers with a minimal set of bidirectional LSTM features. However, their results were limited to projective parsing. In this paper, we extend their approach to support non-projectivity by providing the first practical implementation of the MH4 algorithm, an O(n^4) mildly nonprojective dynamic-programming parser with very high coverage on non-projective treebanks. To make MH4 compatible with minimal transition-based feature sets, we introduce a transition-based interpretation of it in which parser items are mapped to sequences of transitions. We thus obtain the first implementation of global decoding for non-projective transition-based parsing, and demonstrate empirically that it is more effective than its projective counterpart in parsing a number of highly non-projective languages.more » « less
-
Automatically transforming programs is hard, yet critical for automated program refactoring, rewriting, and repair. Multi-language syntax transformation is especially hard due to heterogeneous representations in syntax, parse trees, and abstract syntax trees (ASTs). Our insight is that the problem can be decomposed such that (1) a common grammar expresses the central context-free language (CFL) properties shared by many contemporary languages and (2) open extension points in the grammar allow customizing syntax (e.g., for balanced delimiters) and hooks in smaller parsers to handle language-specific syntax (e.g., for comments). Our key contribution operationalizes this decomposition using a Parser Parser combinator (PPC), a mechanism that generates parsers for matching syntactic fragments in source code by parsing declarative user-supplied templates. This allows our approach to detach from translating input programs to any particular abstract syntax tree representation, and lifts syntax rewriting to a modularly-defined parsing problem. A notable effect is that we skirt the complexity and burden of defining additional translation layers between concrete user input templates and an underlying abstract syntax representation. We demonstrate that these ideas admit efficient and declarative rewrite templates across 12 languages, and validate effectiveness of our approach by producing correct and desirable lightweight transformations on popular real-world projects (over 50 syntactic changes produced by our approach have been merged into 40+). Our declarative rewrite patterns require an order of magnitude less code compared to analog implementations in existing, language-specific tools.more » « less
-
This paper presents a conversational pipeline for crafting domain knowledge for complex neuro-symbolic models through natural language prompts. It leverages large language models to generate declarative programs in the DomiKnowS framework. The programs in this framework express concepts and their relationships as a graph in addition to logical constraints between them. The graph, later, can be connected to trainable neural models according to those specifications. Our proposed pipeline utilizes techniques like dynamic in-context demonstration retrieval, model refinement based on feedback from a symbolic parser, visualization, and user interaction to generate the tasks’ structure and formal knowledge representation. This approach empowers domain experts, even those not well-versed in ML/AI, to formally declare their knowledge to be incorporated in customized neural models in the DomiKnowS framework.more » « less