NSF PAR Search | NSF Public Access Repository

ROSE: RADICAL Orchestrator for Surrogate Exploration

https://doi.org/10.1145/3731599.3767347

Alsaadi, Aymen; Wang, Tianle; Park, Andrew; Bajracharya, Pradeep; Wang, Linwei; Sun, Fanbo; Seal, Sudip; Jadhao, Vikram; Fox, Geoffrey; Jha, Shantenu (November 2025, ACM)

Free, publicly-accessible full text available November 15, 2026

LatticeGen: Hiding Generated Text in a Lattice for Privacy-Aware Large Language Model Generation on Cloud

Zhang, Mengke; He, Tianxing; Wang, Tianle; Mi, Lu; Mireshghallah, Niloofar; Chen, Binyi; Wang, Hao; Tsvetkov, Yulia (June 2024, NAACL)

In the current user-server interaction paradigm of prompted generation with large language models (LLMs) on cloud, the server fully controls the generation process, which leaves zero options for users who want to keep the generated text private to themselves. For privacy-aware text generation on cloud, we propose LatticeGen, a cooperative protocol in which the server still handles most of the computation while the client controls the sampling operation. The key idea is that the true generated sequence is mixed with noise tokens by the client and hidden in a noised lattice. Only the client knows which tokens are the true ones. Considering potential attacks from a hypothetically malicious server and how the client can defend against it, we propose the repeated beam-search attack and the mixing noise scheme. In our experiments we apply LatticeGen to protect both prompt and generation. It is shown that while the noised lattice degrades generation quality, LatticeGen successfully protects the true generation to a remarkable degree under strong attacks (more than 50% of the semantic remains hidden as measured by BERTScore).
Full Text Available

WOT-Class: Weakly Supervised Open-world Text Classification

https://doi.org/10.1145/3583780.3615109

Wang, Tianle; Wang, Zihan; Liu, Weitang; Shang, Jingbo (October 2023, CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management)

Imaging tunable Luttinger liquid systems in van der Waals heterostructures

https://doi.org/10.1038/s41586-024-07596-6

Li, Hongyuan; Xiang, Ziyu; Wang, Tianle; Naik, Mit H; Kim, Woochang; Nie, Jiahui; Li, Shiyu; Ge, Zhehao; He, Zehao; Ou, Yunbo; et al (July 2024, Nature)

Full Text Available

On the Blind Spots of Model-Based Evaluation Metrics for Text Generation

He, Tianxing; Zhang, Jingyu; Wang, Tianle; Kumar, Sachin; Cho, Kyunghyun; Glass, James; Tsvetkov, Yulia (July 2023, ACL: Annual Meeting of the Association for Computational Linguistics)

In this work, we explore a useful but often neglected methodology for robustness analysis of text generation evaluation metrics: stress tests with synthetic data. Basically, we design and synthesize a wide range of potential errors and check whether they result in a commensurate drop in the metric scores. We examine a range of recently proposed evaluation metrics based on pretrained language models, for the tasks of open-ended generation, translation, and summarization. Our experiments reveal interesting insensitivities, biases, or even loopholes in existing metrics. For example, we find that BERTScore is confused by truncation errors in summarization, and MAUVE (built on top of GPT-2) is insensitive to errors at the beginning or middle of generations. Further, we investigate the reasons behind these blind spots and suggest practical workarounds for a more reliable evaluation of text generation. We have released our code and data at https://github.com/cloudygoose/blindspot_nlg.
Full Text Available

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Srivastava, Aarohi; Rastogi, Abhinav; Rao, Abhishek; Shoeb, Abu Awal; Abid, Abubakar; Fisch, Adam; Brown, Adam R.; Santoro, Adam; Gupta, Aditya; Garriga-Alonso, Adri; et al (January 2023, Transactions on machine learning research)

Full Text Available
