Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.
- Free, publicly-accessible full text available July 1, 2026
- Free, publicly-accessible full text available May 1, 2026
- Free, publicly-accessible full text available June 11, 2026
- Accurate in-hospital length-of-stay prediction is a vital quality metric for hospital leaders and health policy decision-makers. It assists with decision-making and informs hospital operations involving factors such as patient flow, elective cases, and human resources allocation, while also informing quality-of-care and risk considerations. The aim of the research reported in this paper is to use survival analysis to model General Internal Medicine (GIM) length of stay, and to use Shapley values to support interpretation of the resulting model. Survival analysis predicts the time until a specific event occurs; in our study, we predict the duration from patient admission to discharge home, i.e., in-hospital length of stay. In addition to discussing the modeling results, we also discuss how survival analysis of hospital length of stay can be used to guide improvements in the efficiency of hospital operations and support the development of quality metrics. (For a toy illustration of this kind of survival model, see the first sketch after this list.)
- Free, publicly-accessible full text available May 1, 2026
- Free, publicly-accessible full text available November 1, 2025
- Free, publicly-accessible full text available January 1, 2026
- We study the problem of fine-tuning a language model (LM) for a target task by optimally using the information from n auxiliary tasks. This problem has broad applications in NLP, such as targeted instruction tuning and data selection in chain-of-thought fine-tuning. The key challenge is that not all auxiliary tasks are useful for improving performance on the target task, so choosing the right subset of auxiliary tasks is crucial. Conventional subset selection methods, such as forward and backward selection, are unsuitable for LM fine-tuning because they require repeated training on subsets of auxiliary tasks. This paper introduces a new algorithm to estimate model fine-tuning performance without repeated training. Our algorithm first performs multitask training using the data of all the tasks to obtain a meta initialization. Then, we approximate the model fine-tuning loss of a subset using functional values and gradients from the meta initialization. Empirically, we find that this gradient-based approximation holds with remarkable accuracy for twelve transformer-based LMs, so we can estimate fine-tuning performance on CPUs within a few seconds. We conduct extensive experiments to validate our approach, delivering a 30× speedup over conventional subset selection while incurring only 1% error relative to the true fine-tuning performance. In downstream evaluations of instruction tuning and chain-of-thought fine-tuning, our approach improves over prior methods that use gradient or representation similarity for subset selection by up to 3.8%. (A toy version of the first-order estimate appears in the second sketch after this list.)
  Free, publicly-accessible full text available November 16, 2025
- Verifiable generation requires large language models (LLMs) to cite source documents supporting their outputs, thereby improving output transparency and trustworthiness. Yet previous work mainly targets the generation of sentence-level citations, lacking specificity about which part of a sentence is backed by which cited source. This work studies verifiable generation with subsentence-level fine-grained citations that locate the generated content supported by the cited sources more precisely. We first present a dataset, SCIFI, comprising 10K Wikipedia paragraphs with subsentence-level citations. Each paragraph in SCIFI is paired with a set of candidate source documents for citation and a query that triggers the generation of the paragraph content. On SCIFI, we then evaluate the performance of state-of-the-art LLMs and of long-document processing strategies designed for these models. Our experimental results reveal key factors that can enhance citation quality, including expanding the source-document context accessible to the models and applying specialized model tuning. (A hypothetical record layout appears in the third sketch after this list.)
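The first sketch below is a minimal, hypothetical illustration of survival-style length-of-stay modeling with a Cox proportional hazards model. It is not the paper's GIM pipeline: the lifelines library, the toy covariates (age, num_comorbidities), and the tiny synthetic cohort are all assumptions made for illustration, and the Shapley-value interpretation step from the abstract is omitted.

```python
# Minimal sketch of length-of-stay survival modeling, assuming the
# lifelines library; features and data are invented for illustration.
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical admissions: duration = days from admission to discharge
# home; event = 1 if discharged home, 0 if censored (e.g., transferred
# or still in hospital at the end of the observation window).
df = pd.DataFrame({
    "duration_days": [3, 7, 2, 14, 5, 9, 4, 11],
    "discharged_home": [1, 1, 1, 0, 1, 1, 1, 0],
    "age": [54, 81, 47, 76, 63, 70, 58, 85],
    "num_comorbidities": [1, 4, 0, 5, 2, 3, 1, 4],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="duration_days", event_col="discharged_home")

# Predicted median in-hospital length of stay per patient profile.
print(cph.predict_median(df))
```

In practice a Shapley-based explainer would sit on top of such a fitted model, attributing each covariate's contribution to the predicted risk for an individual admission.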
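The second sketch is a hedged reading of the gradient-based estimate described in the fine-tuning abstract: precompute each task's gradient at the meta initialization once, then score any candidate subset with a first-order Taylor expansion instead of retraining. The function name, the single averaged gradient step, and the learning rate are illustrative assumptions, not the paper's actual algorithm.

```python
# Hedged sketch: estimate post-fine-tuning loss from a meta
# initialization via a first-order approximation. Names and the
# one-step update rule are assumptions made for illustration.
import torch

def estimate_subset_loss(target_loss, target_grad, task_grads, subset, lr=1e-2):
    """Estimate the target-task loss after fine-tuning on `subset`.

    target_loss: scalar target-task loss at the meta initialization.
    target_grad: flattened target-loss gradient at the meta init.
    task_grads:  {task_id: flattened task-loss gradient at the meta
                 init}, precomputed once so no retraining is needed.
    """
    # Model one hypothetical gradient step on the averaged subset loss.
    delta = -lr * torch.stack([task_grads[t] for t in subset]).mean(dim=0)
    # First-order expansion: L(theta + delta) ~= L(theta) + <grad, delta>.
    return target_loss + torch.dot(target_grad, delta)

# Toy usage: random vectors stand in for real model gradients.
dim = 16
task_grads = {t: torch.randn(dim) for t in range(4)}
target_grad = torch.randn(dim)
candidates = [[0, 1], [1, 2], [0, 2, 3]]
best = min(candidates,
           key=lambda s: estimate_subset_loss(1.0, target_grad,
                                              task_grads, s).item())
print("estimated best auxiliary subset:", best)
```

Because the expansion only needs dot products of stored gradient vectors, ranking many subsets this way runs on a CPU in seconds, which is the speedup the abstract reports.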
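Finally, a hypothetical sketch of what a subsentence-level citation record might look like. The abstract does not specify SCIFI's schema, so every field name and the example spans below are assumptions made purely to illustrate citing at finer granularity than a sentence.

```python
# Hypothetical record layout for subsentence-level citation; this is
# NOT the actual SCIFI schema, which the abstract does not specify.
example_record = {
    "query": "Describe the aftermath of the 1906 San Francisco earthquake.",
    "candidate_docs": {
        "D1": "...full text of candidate source document 1...",
        "D2": "...full text of candidate source document 2...",
    },
    # One sentence is split into spans, each with its own citations.
    "paragraph": [
        {"span": "The earthquake ruptured the San Andreas Fault,",
         "cites": ["D1"]},
        {"span": "and the fires that followed destroyed much of the city.",
         "cites": ["D2"]},
    ],
}
```

A sentence-level scheme would attach ["D1", "D2"] to the whole sentence; the subsentence layout above records which clause each source actually supports.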