Title: Learning what to remember
We consider a lifelong learning scenario in which a learner faces a neverending and arbitrary stream of facts and has to decide which ones to retain in its limited memory. We introduce a mathematical model based on the online learning framework, in which the learner measures itself against a collection of experts that are also memory-constrained and that reflect different policies for what to remember. Interspersed with the stream of facts are occasional questions, and on each of these the learner incurs a loss if it has not remembered the corresponding fact. Its goal is to do almost as well as the best expert in hindsight, while using roughly the same amount of memory. We identify difficulties with using the multiplicative weights update algorithm in this memory-constrained scenario, and design an alternative scheme whose regret guarantees are close to the best possible.
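Since the abstract turns on the behavior of the multiplicative weights update (MWU) in a memory-constrained setting, a minimal sketch of the vanilla algorithm may help; the function name, the learning rate eta, and the loss encoding are our own illustrative choices, not the paper's.

```python
import numpy as np

def multiplicative_weights(loss_stream, n_experts, eta=0.1):
    """Vanilla multiplicative weights update over a fixed pool of experts.

    loss_stream yields, per round, an array of n_experts losses in [0, 1].
    Returns the sequence of probability distributions the learner played.
    """
    w = np.ones(n_experts)                      # one weight per expert
    plays = []
    for losses in loss_stream:
        p = w / w.sum()                         # play the normalized weights
        plays.append(p)
        w *= np.exp(-eta * np.asarray(losses))  # shrink weights of experts that erred
    return plays
```

Note that vanilla MWU maintains per-expert state for the entire pool; when each expert is itself a memory-bounded policy and the learner must run in roughly the same amount of memory, that per-expert bookkeeping is plausibly where the difficulty the abstract mentions arises.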
Award ID(s):
1813160
PAR ID:
10343006
Author(s) / Creator(s):
Editor(s):
Dasgupta, S.; Haghtalab, N.
Date Published:
Journal Name:
Proceedings of Machine Learning Research
Volume:
167
ISSN:
2640-3498
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. In supervised continual learning, a deep neural network (DNN) is updated with an ever-growing data stream. Unlike the offline setting, where data is shuffled, we cannot make any distributional assumptions about the data stream, and ideally only one pass through the data is needed for computational efficiency. However, existing methods make assumptions that do not hold in real-world applications, while failing to improve computational efficiency. In this paper, we propose SIESTA, a novel continual learning method based on a wake/sleep training framework that is well aligned with the needs of on-device learning. The major goal of SIESTA is to advance compute-efficient continual learning so that DNNs can be updated using far less time and energy. The principal innovations of SIESTA are: 1) rapid online updates using a rehearsal-free, backpropagation-free, and data-driven network update rule during its wake phase, and 2) expedited memory consolidation using a compute-restricted rehearsal policy during its sleep phase. For memory efficiency, SIESTA adapts latent rehearsal with memory indexing from REMIND. Compared to REMIND and prior art, SIESTA is far more computationally efficient, enabling continual learning on ImageNet-1K in under 2 hours on a single GPU; moreover, in the augmentation-free setting it matches the performance of the offline learner, a milestone critical to driving adoption of continual learning in real-world applications.
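As a rough illustration of the wake/sleep division of labor described in the item above, here is a hedged Python sketch; the class and its methods (encode, update_head, train_on_latent) are hypothetical stand-ins for whatever SIESTA actually uses, and the buffer of compressed latents loosely mirrors the REMIND-style latent rehearsal the abstract mentions.

```python
from collections import deque
import random

class WakeSleepLearner:
    """Illustrative wake/sleep continual-learning loop; not SIESTA's real API."""

    def __init__(self, model, buffer_size, sleep_budget):
        self.model = model                       # backbone + plastic output head
        self.buffer = deque(maxlen=buffer_size)  # compressed latent replay buffer
        self.sleep_budget = sleep_budget         # max rehearsal steps per sleep

    def wake_step(self, x, y):
        """Rapid online update: one cheap forward pass, no backpropagation."""
        z = self.model.encode(x)                 # compress input to a latent code
        self.model.update_head(z, y)             # e.g., a running class-mean update
        self.buffer.append((z, y))               # store only the compact latent

    def sleep_phase(self):
        """Compute-restricted consolidation: bounded rehearsal over stored latents."""
        steps = min(self.sleep_budget, len(self.buffer))
        for z, y in random.sample(list(self.buffer), steps):
            self.model.train_on_latent(z, y)     # backprop happens only here
```

The resource asymmetry is the point: wake steps avoid backpropagation entirely, while sleep steps use it but only under a fixed compute budget.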
  2. We study learning in a dynamically evolving environment modeled as a Markov game between a learner and a strategic opponent that can adapt to the learner's strategies. While most existing work on Markov games focuses on external regret as the learning objective, external regret becomes inadequate when the adversary is adaptive. In this work, we focus on policy regret, a counterfactual notion that aims to compete with the return that would have been attained if the learner had followed the best fixed sequence of policies in hindsight. We show that if the opponent has unbounded memory or is non-stationary, then sample-efficient learning is not possible. For memory-bounded and stationary adversaries, we show that learning is still statistically hard if the set of feasible strategies for the learner is exponentially large. To guarantee learnability, we introduce a new notion of consistent adaptive adversaries, wherein the adversary responds similarly to similar strategies of the learner. We provide algorithms that achieve √T policy regret against memory-bounded, stationary, and consistent adversaries.
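To make the contrast with external regret concrete, policy regret is often written along the following lines; the notation is our reconstruction rather than the paper's, with Π the learner's feasible policy set, π_t the policy played at round t, and f_t the adaptive adversary's response to the learner's recent history.

```latex
% Policy regret after T rounds: the comparator term replays the whole game,
% letting the adversary f_t re-adapt to a counterfactual history in which the
% fixed policy \pi was played throughout.
\mathrm{Reg}^{\mathrm{policy}}_T
  = \max_{\pi \in \Pi} \mathbb{E}\Bigl[\sum_{t=1}^{T} r_t\bigl(\pi,\, f_t(\pi,\dots,\pi)\bigr)\Bigr]
  - \mathbb{E}\Bigl[\sum_{t=1}^{T} r_t\bigl(\pi_t,\, f_t(\pi_1,\dots,\pi_{t-1})\bigr)\Bigr]
```

External regret would instead hold the adversary's realized behavior fixed, which understates the effect of an adaptive opponent; the item's guarantee is then a bound of the form Reg^policy_T = O(√T) against memory-bounded, stationary, and consistent adversaries.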
  3. Some people exhibit impressive memory for a wide array of semantic knowledge. What makes these trivia experts better able to learn and retain novel facts? We hypothesized that new semantic knowledge may be more strongly linked to its episodic context in trivia experts. We designed a novel online task in which 132 participants varying in trivia expertise encoded “exhibits” of naturalistic facts with related photos in one of two “museums.” Afterward, participants were tested on cued recall of facts and recognition of the associated photo and museum. Greater trivia expertise predicted higher cued recall for novel facts. Critically, trivia experts but not non-experts showed superior fact recall when they remembered both features (photo and museum) of the encoding context. These findings illustrate enhanced links between episodic memory and new semantic learning in trivia experts, and show the value of studying trivia experts as a special population that can shed light on the mechanisms of memory. 
  4. Abstract Curiosity can be a powerful motivator to learn and retain new information. Evidence shows that high states of curiosity elicited by a specific source (i.e., a trivia question) can promote memory for incidental stimuli (non-target) presented close in time. The spreading effect of curiosity states on memory for other information has potential for educational applications. Specifically, it could provide techniques to improve learning for information that did not spark a sense of curiosity on its own. Here, we investigated how high states of curiosity induced through trivia questions affect memory performance for unrelated scholastic facts (e.g., scientific, English, or historical facts) presented in close temporal proximity to the trivia question. Across three task versions, participants viewed trivia questions closely followed in time by a scholastic fact unrelated to the trivia question, either just prior to or immediately following the answer to the trivia question. Participants then completed a surprise multiple-choice memory test (akin to a pop quiz) for the scholastic material. In all three task versions, memory performance was poorer for scholastic facts presented after trivia questions that had elicited high versus low levels of curiosity. These results contradict previous findings showing curiosity-enhanced memory for incidentally presented visual stimuli and suggest that target information that generates a high-curiosity state interferes with encoding complex and unrelated scholastic facts presented close in time. 
  5. Continual learning has gained substantial attention within the deep learning community, offering promising solutions to the challenging problem of sequential learning. Yet a largely unexplored facet of this paradigm is its susceptibility to adversarial attacks, especially those aimed at inducing forgetting. In this paper, we introduce BrainWash, a novel data poisoning method tailored to impose forgetting on a continual learner. By adding the BrainWash noise to a variety of baselines, we demonstrate how a trained continual learner can be induced to catastrophically forget its previously learned tasks, even under these continual learning baselines. An important feature of our approach is that the attacker requires no access to previous tasks' data and is armed merely with the model's current parameters and the data belonging to the most recent task. Our extensive experiments highlight the efficacy of BrainWash, showcasing degradation in performance across various regularization-based and memory-replay-based continual learning methods. Our code is available here: https://github.com/mint-vu/Brainwash
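To give a feel for the threat model, here is a heavily hedged PyTorch sketch of a poisoning loop that respects the stated constraints (access only to the current parameters and the most recent task's data); the inner objective below is an illustrative stand-in, since we are reconstructing from the abstract rather than from the paper's actual BrainWash objective, and every name in it is ours.

```python
import torch

def poison_current_task(model, loss_fn, x_cur, y_cur,
                        eps=8 / 255, steps=50, lr=1e-2):
    """Craft bounded noise for the current task's batch (illustrative only).

    Mirrors the abstract's threat model: no past-task data, just the
    model's current parameters and the latest task's (x_cur, y_cur).
    """
    delta = torch.zeros_like(x_cur, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        out = model(x_cur + delta)
        # Stand-in objective: the real attack optimizes the noise so that a
        # continual-learning update on the poisoned batch induces forgetting
        # of earlier tasks; here we simply maximize current-task loss.
        (-loss_fn(out, y_cur)).backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the perturbation bounded
    return (x_cur + delta).detach()
```

The design constraint worth noting is that the optimization touches only the current batch and the current weights, matching the abstract's claim that the attacker needs no access to earlier tasks' data.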