NSF PAR Search | NSF Public Access Repository

Offset Unlearning for Large Language Models

Huang, James Y.; Zhou, Wenxuan; Wang, Fei; Morstatter, Fred; Zhang, Sheng; Poon, Hoifung; Chen, Muhao (May 2025, Transactions on Machine Learning Research)

Free, publicly-accessible full text available May 1, 2026
Parameter-Efficient Tuning with Special Token Adaptation

Yang, Xiaocong; Huang, James Y.; Zhou, Wenxuan; Chen, Muhao (May 2023, Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL))
Vlachos, Andreas; Augenstein, Isabelle (Ed.)
Parameter-efficient tuning aims at updating only a small subset of parameters when adapting a pretrained model to downstream tasks. In this work, we introduce PASTA, in which we only modify the special token representations (e.g., [SEP] and [CLS] in BERT) before the self-attention module at each layer in Transformer-based models. PASTA achieves comparable performance to fine-tuning in natural language understanding tasks including text classification and NER with up to only 0.029% of total parameters trained. Our work not only provides a simple yet effective way of parameter-efficient tuning, which has a wide range of practical applications when deploying finetuned models for multiple tasks, but also demonstrates the pivotal role of special tokens in pretrained language models.
Full Text Available
Robust Natural Language Understanding with Residual Attention Debiasing

https://doi.org/10.18653/v1/2023.findings-acl.32

Wang, Fei; Huang, James Y.; Yan, Tianyi; Zhou, Wenxuan; Chen, Muhao (January 2023, Findings of the Association for Computational Linguistics: ACL 2023)

Natural language understanding (NLU) models often suffer from unintended dataset biases. Among bias mitigation methods, ensemble-based debiasing methods, especially product-of-experts (PoE), have stood out for their impressive empirical success. However, previous ensemble-based debiasing methods typically apply debiasing on top-level logits without directly addressing biased attention patterns. Attention serves as the main media of feature interaction and aggregation in PLMs and plays a crucial role in providing robust prediction. In this paper, we propose REsidual Attention Debiasing (READ), an end-to-end debiasing method that mitigates unintended biases from attention. Experiments on three NLU benchmarks show that READ significantly improves the OOD performance of BERT-based models, including +12.9% accuracy on HANS, +11.0% accuracy on FEVER-Symmetric, and +2.7% F1 on PAWS. Detailed analyses demonstrate the crucial role of unbiased attention in robust NLU models and that READ effectively mitigates biases in attention.
Full Text Available
Bridging Continuous and Discrete Spaces: Interpretable Sentence Representation Learning via Compositional Operations

https://doi.org/10.18653/v1/2023.emnlp-main.900

Huang, James; Yao, Wenlin; Song, Kaiqiang; Zhang, Hongming; Chen, Muhao; Yu, Dong (January 2023, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing)

Traditional sentence embedding models encode sentences into vector representations to capture useful properties such as the semantic similarity between sentences. However, in addition to similarity, sentence semantics can also be interpreted via compositional operations such as sentence fusion or difference. It is unclear whether the compositional semantics of sentences can be directly reflected as compositional operations in the embedding space. To more effectively bridge the continuous embedding and discrete text spaces, we explore the plausibility of incorporating various compositional properties into the sentence embedding space that allows us to interpret embedding transformations as compositional sentence operations. We propose InterSent, an end-to-end framework for learning interpretable sentence embeddings that supports compositional sentence operations in the embedding space. Our method optimizes operator networks and a bottleneck encoder-decoder model to produce meaningful and interpretable sentence embeddings. Experimental results demonstrate that our method significantly improves the interpretability of sentence embeddings on four textual generation tasks over existing approaches while maintaining strong performance on traditional semantic similarity tasks.
Unified Semantic Typing with Meaningful Label Inference

https://doi.org/10.18653/v1/2022.naacl-main.190

Huang, James Y.; Li, Bangzheng; Xu, Jiashu; Chen, Muhao (January 2022, Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies)

Semantic typing aims at classifying tokens or spans of interest in a textual context into semantic categories such as relations, entity types, and event types. The inferred labels of semantic categories meaningfully interpret how machines understand components of text. In this paper, we present UniST, a unified framework for semantic typing that captures label semantics by projecting both inputs and labels into a joint semantic embedding space. To formulate different lexical and relational semantic typing tasks as a unified task, we incorporate task descriptions to be jointly encoded with the input, allowing UniST to be adapted to different tasks without introducing task-specific model components. UniST optimizes a margin ranking loss such that the semantic relatedness of the input and labels is reflected from their embedding similarity. Our experiments demonstrate that UniST achieves strong performance across three semantic typing tasks: entity typing, relation classification and event typing. Meanwhile, UniST effectively transfers semantic knowledge of labels and substantially improves generalizability on inferring rarely seen and unseen types. In addition, multiple semantic typing tasks can be jointly trained within the unified framework, leading to a single compact multi-tasking model that performs comparably to dedicated single-task models, while offering even better transferability.
Full Text Available
Local Environment Affects the Activity of Enzymes on a 3D Molecular Scaffold

https://doi.org/10.1021/acsnano.0c03962

Xiong, Yan; Huang, James; Wang, Shih-Ting; Zafar, Sufi; Gang, Oleg (November 2020, ACS Nano)

Full Text Available
Surface Zwitterionization of Expanded Poly(tetrafluoroethylene) via Dopamine-Assisted Consecutive Immersion Coating

https://doi.org/10.1021/acsami.0c09073

Fowler, Peter Matthew; Dizon, Gian Vincent; Tayo, Lemmuel L.; Caparanga, Alvin R.; Huang, James; Zheng, Jie; Aimar, Pierre; Chang, Yung (September 2020, ACS Applied Materials & Interfaces)
Full Text Available
Cascaded Enzyme Reactions over a Three-Dimensional, Wireframe DNA Origami Scaffold

https://doi.org/10.1021/jacsau.1c00387

Kahn, Jason S.; Xiong, Yan; Huang, James; Gang, Oleg (January 2022, JACS Au)
