-
Continual learning (CL) learns a sequence of tasks incrementally. This paper studies the challenging CL setting of class-incremental learning (CIL). CIL has two key challenges: catastrophic forgetting (CF) and inter-task class separation (ICS). Despite numerous proposed methods, these issues remain persistent obstacles. This paper proposes a novel CIL method, called Kernel Linear Discriminant Analysis (KLDA), that can effectively avoid the CF and ICS problems. It leverages only the powerful features learned in a foundation model (FM). However, directly using these features proves suboptimal. To address this, KLDA incorporates the Radial Basis Function (RBF) kernel and its Random Fourier Features (RFF) to enhance the feature representations from the FM, leading to improved performance. When a new task arrives, KLDA computes only the mean for each class in the task and updates a shared covariance matrix for all learned classes based on the kernelized features. Classification is performed using Linear Discriminant Analysis. Our empirical evaluation on text and image classification datasets demonstrates that KLDA significantly outperforms baselines. Remarkably, without relying on replay data, KLDA achieves accuracy comparable to joint training of all classes, which is considered the upper bound for CIL performance. The KLDA code is available at https://github.com/salehmomeni/klda.
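To make the mechanics concrete, here is a minimal NumPy sketch of the idea described in the abstract: Random Fourier Features approximate the RBF kernel, each new task contributes only per-class means and an update to one shared covariance matrix, and prediction is a standard linear discriminant. The class name, dimensions, and update details are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
import numpy as np

class KLDASketch:
    def __init__(self, feat_dim, rff_dim=1000, sigma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # RFF projection approximating the RBF kernel
        # k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))
        self.W = rng.normal(0.0, 1.0 / sigma, size=(feat_dim, rff_dim))
        self.b = rng.uniform(0.0, 2 * np.pi, size=rff_dim)
        self.means = {}                           # one mean per class
        self.cov = np.zeros((rff_dim, rff_dim))   # shared covariance
        self.n = 0

    def _rff(self, X):
        # z(x) = sqrt(2/D) * cos(W^T x + b) approximates the RBF feature map
        return np.sqrt(2.0 / self.W.shape[1]) * np.cos(X @ self.W + self.b)

    def learn_task(self, X, y):
        # Incremental update: class means plus a running shared covariance;
        # no replay data and no gradient-based training are needed.
        Z = self._rff(X)
        for c in np.unique(y):
            self.means[c] = Z[y == c].mean(axis=0)
        centered = Z - np.stack([self.means[c] for c in y])
        self.cov = (self.n * self.cov + centered.T @ centered) / (self.n + len(y))
        self.n += len(y)

    def predict(self, X):
        Z = self._rff(X)
        P = np.linalg.pinv(self.cov)  # shared precision matrix
        classes = sorted(self.means)
        # Linear discriminant score per class (equal priors assumed)
        scores = np.stack(
            [Z @ P @ self.means[c] - 0.5 * self.means[c] @ P @ self.means[c]
             for c in classes], axis=1)
        return np.array(classes)[scores.argmax(axis=1)]
```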
-
Existing continual learning (CL) methods mainly rely on fine-tuning or adapting large language mod- els (LLMs). They still suffer from catastrophic for- getting (CF). Little work has been done to exploit in-context learning (ICL) to leverage the extensive knowledge within LLMs for CL without updating any parameters. However, incrementally learning each new task in ICL necessitates adding training examples from each class of the task to the prompt, which hampers scalability as the prompt length in- creases. This issue not only leads to excessively long prompts that exceed the input token limit of the underlying LLM but also degrades the model’s performance due to the overextended context. To address this, we introduce InCA, a novel approach that integrates an external continual learner (ECL) with ICL to enable scalable CL without CF. The ECL is built incrementally to pre-select a small subset of likely classes for each test instance. By restricting the ICL prompt to only these selected classes, InCA prevents prompt lengths from becom- ing excessively long, while maintaining high per- formance. Experimental results demonstrate that InCA significantly outperforms existing CL base- lines, achieving substantial performance gains.more » « lessFree, publicly-accessible full text available January 19, 2026
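A rough sketch of how such an external continual learner might work, under the assumption that it tracks one running mean embedding per class and ranks classes by cosine similarity to the test instance; the embed() and query_llm() helpers are hypothetical placeholders, not part of the paper.

```python
import numpy as np

class ExternalContinualLearner:
    def __init__(self):
        self.class_means = {}    # class label -> mean embedding
        self.class_counts = {}

    def add_examples(self, label, embeddings):
        # Incremental mean update: no stored examples, no parameter training
        for e in embeddings:
            n = self.class_counts.get(label, 0)
            mu = self.class_means.get(label, np.zeros_like(e))
            self.class_means[label] = (n * mu + e) / (n + 1)
            self.class_counts[label] = n + 1

    def top_k_classes(self, embedding, k=3):
        # Rank all learned classes by cosine similarity to the test instance
        sims = {c: embedding @ mu /
                (np.linalg.norm(embedding) * np.linalg.norm(mu) + 1e-9)
                for c, mu in self.class_means.items()}
        return sorted(sims, key=sims.get, reverse=True)[:k]

def classify(text, ecl, examples_by_class, embed, query_llm, k=3):
    # Restrict the in-context prompt to the k classes the ECL pre-selected,
    # so prompt length stays bounded regardless of how many classes exist.
    candidates = ecl.top_k_classes(embed(text), k=k)
    demos = "\n".join(f"Text: {ex}\nLabel: {c}"
                      for c in candidates for ex in examples_by_class[c][:2])
    prompt = f"{demos}\nText: {text}\nLabel:"
    return query_llm(prompt)
```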
-
As more and more AI agents are used in practice, it is time to think about how to make these agents fully autonomous so that they can (1) learn by themselves continually in a self-motivated and self-initiated manner rather than being retrained offline periodically on the initiation of human engineers and (2) accommodate or adapt to unexpected or novel circumstances. As the real world is an open environment that is full of unknowns or novelties, the capabilities of detecting novelties, characterizing them, accommodating or adapting to them, gathering ground-truth training data, and incrementally learning the unknowns/novelties become critical in making the AI agent more and more knowledgeable, powerful, and self-sustainable over time. The key challenge here is how to automate the process so that it is carried out continually on the agent's own initiative and through its own interactions with humans, other agents, and the environment, just like human on-the-job learning. This paper proposes a framework (called SOLA) for this learning paradigm to promote the research of building autonomous and continual-learning-enabled AI agents. To show feasibility, an implemented agent is also described.
-
Dialogue systems, also called chatbots, are now used in a wide range of applications. However, they still have some major weaknesses. One key weakness is that they are typically trained from manually labeled data and/or written with handcrafted rules, and their knowledge bases (KBs) are also compiled by human experts. Due to the huge amount of manual effort involved, they are difficult to scale and also tend to produce many errors owing to their limited ability to understand natural language and the limited knowledge in their KBs. Thus, user satisfaction is often low. In this paper, we propose to dramatically improve the situation by endowing chatbots with the ability to continually learn (1) new world knowledge, (2) new language expressions to ground them to actions, and (3) new conversational skills, during conversation by themselves, so that as they chat more and more with users, they become more and more knowledgeable and are better and better able to understand diverse natural language expressions and to improve their conversational skills.
-
Entity-Aware Dependency-Based Deep Graph Attention Network for Comparative Preference Classification
This paper studies the task of comparative preference classification (CPC). Given two entities in a sentence, our goal is to classify whether the first (or the second) entity is preferred over the other or no comparison is expressed at all between the two entities. Existing works either do not learn entity-aware representations well and fail to deal with sentences involving multiple entity pairs, or use sequential modeling approaches that are unable to capture long-range dependencies between the entities. Some also use traditional machine learning approaches that do not generalize well. This paper proposes a novel Entity-aware Dependency-based Deep Graph Attention Network (ED-GAT) that employs a multi-hop graph attention over a dependency-graph sentence representation to leverage both the semantic information from word embeddings and the syntactic information from the dependency graph to solve the problem. Empirical evaluation shows that the proposed model achieves state-of-the-art performance in comparative preference classification.
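The following PyTorch sketch illustrates the multi-hop graph attention idea: word representations are repeatedly updated by attending over dependency-graph neighbors, so each hop extends the receptive field by one syntactic edge, and the two entity representations are pooled for classification. The hop count, pooling scheme, and layer sizes are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionHop(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)
        self.attn = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, h, adj):
        # h: (n_words, dim); adj: (n_words, n_words) dependency adjacency (0/1)
        Wh = self.W(h)
        n = Wh.size(0)
        # Pairwise attention logits, masked to dependency edges only
        pairs = torch.cat([Wh.unsqueeze(1).expand(n, n, -1),
                           Wh.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = self.attn(pairs).squeeze(-1)
        e = e.masked_fill(adj == 0, float('-inf'))
        alpha = F.softmax(e, dim=-1)
        return F.elu(alpha @ Wh)

class EDGATSketch(nn.Module):
    def __init__(self, dim, num_hops=3, num_classes=3):
        super().__init__()
        self.hops = nn.ModuleList([GraphAttentionHop(dim)
                                   for _ in range(num_hops)])
        # Three outcomes: first preferred / second preferred / no comparison
        self.cls = nn.Linear(2 * dim, num_classes)

    def forward(self, word_embs, adj, e1_idx, e2_idx):
        # Self-loops keep the softmax well-defined for isolated nodes
        adj = adj + torch.eye(adj.size(0), device=adj.device)
        h = word_embs
        for hop in self.hops:   # each hop reaches one more dependency step
            h = hop(h, adj)
        # Entity-aware: classify from the two entity node representations
        return self.cls(torch.cat([h[e1_idx], h[e2_idx]], dim=-1))
```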