NSF PAR Search | NSF Public Access Repository

Great Memory, Shallow Reasoning: Limits of kNN-LMs

Geng, Shangyi; Zhao, Wenting; Rush, Alexander M (April 2025, NAACL)

Free, publicly-accessible full text available April 29, 2026
Challenges in Trustworthy Human Evaluation of Chatbots

Zhao, Wenting; Rush, Alexander M; Goyal, Tanya (April 2025, NAACL)

Free, publicly-accessible full text available April 29, 2026
Commit0: Library Generation from Scratch

Zhao, Wenting; Jiang, Nan; Lee, Celine; Chiu, Justin T; Cardie, Claire; Galle, Matthias; Rush, Alexander M (April 2025, ICLR)

Free, publicly-accessible full text available April 30, 2026
I Could’ve Asked That: Reformulating Unanswerable Questions

Zhao, Wenting; Gao, Ge; Cardie, Claire; Rush, Alexander M (November 2024, EMNLP)

When seeking information from unfamiliar documents, users frequently pose questions that cannot be answered by the documents. While existing large language models (LLMs) identify these unanswerable questions, they do not assist users in reformulating their questions, thereby reducing their overall utility. We curate CouldAsk, an evaluation benchmark composed of existing and new datasets for document-grounded question answering, specifically designed to study reformulating unanswerable questions. We evaluate state-of-the-art open-source and proprietary LLMs on CouldAsk. The results demonstrate the limited capabilities of these models in reformulating questions. Specifically, GPT-4 and Llama2-7B successfully reformulate questions only 26% and 12% of the time, respectively. Error analysis shows that 62% of the unsuccessful reformulations stem from the models merely rephrasing the questions or even generating identical questions. We publicly release the benchmark and the code to reproduce the experiments.
Free, publicly-accessible full text available November 11, 2025
Language Model Inversion

Morris, John X; Zhao, Wenting; Chiu, Justin T; Shmatikov, Vitaly; Rush, Alexander M (May 2024, ICLR)

Language models produce a distribution over the next token; can we use this information to recover the prompt tokens? We consider the problem of language model inversion and show that next-token probabilities contain a surprising amount of information about the preceding text. Often we can recover the text in cases where it is hidden from the user, motivating a method for recovering unknown prompts given only the model's current distribution output. We consider a variety of model access scenarios, and show how even without predictions for every token in the vocabulary we can recover the probability vector through search. On Llama-2 7b, our inversion method reconstructs prompts with a BLEU of 59 and token-level F1 of 78 and recovers 27% of prompts exactly. Code for reproducing all experiments is available at this http URL.
Full Text Available
I Could’ve Asked That: Reformulating Unanswerable Questions

https://doi.org/10.18653/v1/2024.emnlp-main.242

Zhao, Wenting; Gao, Ge; Cardie, Claire; Rush, Alexander M (January 2024, Association for Computational Linguistics)

Full Text Available
Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations

https://doi.org/10.18653/v1/2023.acl-long.831

Zhao, Wenting; Chiu, Justin; Cardie, Claire; Rush, Alexander (January 2023, Association for Computational Linguistics)
Hop, Union, Generate: Explainable Multi-hop Reasoning without Rationale Supervision

https://doi.org/10.18653/v1/2023.emnlp-main.1001

Zhao, Wenting; Chiu, Justin; Cardie, Claire; Rush, Alexander (January 2023, Association for Computational Linguistics)

Full Text Available
Symbolic Planning and Code Generation for Grounded Dialogue

https://doi.org/10.18653/v1/2023.emnlp-main.460

Chiu, Justin; Zhao, Wenting; Chen, Derek; Vaduguru, Saujas; Rush, Alexander; Fried, Daniel (January 2023, Association for Computational Linguistics)
HOT-VAE: Learning High-Order Label Correlation for Multi-Label Classification via Attention-Based Variational Autoencoders

Zhao, Wenting; Bai, Junwen; Kong, Shufeng; Fink, Daniel; Gomes, Carla (February 2021, Proceedings of the AAAI Conference on Artificial Intelligence)

Understanding how environmental characteristics affect biodiversity patterns, from individual species to communities of species, is critical for mitigating effects of global change. A central goal for conservation planning and monitoring is the ability to accurately predict the occurrence of species communities and how these communities change over space and time. This in turn leads to a challenging and long-standing problem in the field of computer science - how to perform accurate multi-label classification with hundreds of labels? The key challenge of this problem is its exponential-sized output space with regards to the number of labels to be predicted. Therefore, it is essential to facilitate the learning process by exploiting correlations (or dependency) among labels. Previous methods mostly focus on modelling the correlation on label pairs; however, complex relations between real-world objects often go beyond second order. In this paper, we propose a novel framework for multi-label classification, High-order Tie-in Variational Autoencoder (HOT-VAE), which performs adaptive high-order label correlation learning. We experimentally verify that our model outperforms the existing state-of-the-art approaches on a bird distribution dataset on both conventional F1 scores and a variety of ecological metrics. To show our method is general, we also perform empirical analysis on seven other public real-world datasets in several application domains, and HOT-VAE exhibits superior performance to previous methods.
Full Text Available
