Title: Learning from Mistakes – a Framework for Neural Architecture Search
Learning from one's mistakes is an effective human learning technique: learners focus more on the topics where they made mistakes, so as to deepen their understanding. In this paper, we investigate whether this human learning strategy can be applied to machine learning. We propose a novel method called Learning From Mistakes (LFM), wherein the learner improves its ability to learn by focusing more on its mistakes during revision. We formulate LFM as a three-stage optimization problem: 1) the learner learns; 2) the learner re-learns, focusing on its mistakes; and 3) the learner validates its learning. We develop an efficient algorithm to solve the LFM problem and apply the framework to neural architecture search on CIFAR-10, CIFAR-100, and ImageNet. Experimental results demonstrate the effectiveness of our model.
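For intuition, here is a minimal PyTorch sketch of how the three stages could look as a flat training loop. This is a sketch under stated assumptions: the paper formulates the stages as a nested optimization problem, whereas the single shared model and the simple mistake up-weighting rule (`boost`) below are illustrative simplifications, not the authors' exact method.

```python
# Minimal sketch of the three LFM stages as a flat PyTorch training loop.
# The shared model and the `boost` up-weighting rule are illustrative
# assumptions; the paper poses these stages as nested optimization.
import torch
import torch.nn.functional as F

def lfm_epoch(model, optimizer, train_loader, val_loader, boost=2.0):
    # Stage 1: the learner learns on the training set.
    for x, y in train_loader:
        optimizer.zero_grad()
        F.cross_entropy(model(x), y).backward()
        optimizer.step()

    # Stage 2: the learner re-learns, focusing on the examples it got wrong.
    for x, y in train_loader:
        with torch.no_grad():
            wrong = model(x).argmax(dim=1) != y   # which examples were mistakes
        weights = torch.ones_like(y, dtype=torch.float)
        weights[wrong] = boost                     # up-weight the mistakes
        optimizer.zero_grad()
        per_example = F.cross_entropy(model(x), y, reduction="none")
        (weights * per_example).mean().backward()
        optimizer.step()

    # Stage 3: the learner validates its learning on held-out data.
    correct = total = 0
    with torch.no_grad():
        for x, y in val_loader:
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / total
```

In the paper's nested formulation, stage 2 would train a separate re-learner and stage 3 would drive updates to the architecture variables; the flat loop above only conveys the learn / re-learn-on-mistakes / validate rhythm.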
Award ID(s):
2120019
PAR ID:
10441673
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Volume:
36
Issue:
9
ISSN:
2159-5399
Page Range / eLocation ID:
10184 to 10192
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Bi-level optimization methods in machine learning have proven effective in subdomains such as neural architecture search and data reweighting. However, most of these methods do not account for variations in learning difficulty, which limits their performance in real-world applications. To address this problem, we propose a framework that imitates the human learning process: learners usually focus more on the topics where they have made mistakes in the past, so as to deepen their understanding and master the knowledge. Inspired by this effective human learning technique, we propose a multilevel optimization framework, Learning From Mistakes (LFM), for machine learning. We formulate LFM as a three-stage optimization problem: 1) the learner learns; 2) the learner re-learns based on the mistakes it made before; and 3) the learner validates its learning. We develop an efficient algorithm to solve the optimization problem and apply our method to differentiable neural architecture search and data reweighting. Extensive experiments on CIFAR-10, CIFAR-100, ImageNet, and related datasets demonstrate the effectiveness of our approach. The code of LFM is available at: https://github.com/importZL/LFM.
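As a rough illustration of the data-reweighting application, the sketch below up-weights examples the current model misclassifies before the next training round. The multiplicative update and the `rate` parameter are hypothetical simplifications; the paper instead learns the weights through multilevel optimization.

```python
# Hypothetical sketch of mistake-driven data reweighting in the spirit of
# LFM; the multiplicative `rate` update is an illustrative assumption, not
# the paper's multilevel optimization.
import torch
import torch.nn.functional as F

def reweight_from_mistakes(model, x_all, y_all, weights, rate=1.5):
    """Up-weight examples the current model misclassifies."""
    with torch.no_grad():
        wrong = model(x_all).argmax(dim=1) != y_all
    weights = weights.clone()
    weights[wrong] *= rate           # mistakes gain influence next round
    weights[~wrong] /= rate          # well-learned examples lose influence
    return weights * (len(weights) / weights.sum())  # renormalize to mean 1

def weighted_loss(model, x, y, w):
    """Per-example cross-entropy scaled by the current weights."""
    return (w * F.cross_entropy(model(x), y, reduction="none")).mean()
```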
  2. Simulated learners represent computational theories of human learning that can be used to evaluate educational technologies, provide practice opportunities for teachers, and advance our theoretical understanding of human learning. A key challenge in working with simulated learners is evaluating how accurately the simulation matches the behavior of real human students. One common approach is to compare the error-rate learning curves of a population of human learners with those of a corresponding set of simulated learners. In this paper, we argue that this approach misses an opportunity to capture nuances in learning because it treats all errors as the same. We present a simulated learner system, the Apprentice Learner (AL) Architecture, and use this more nuanced evaluation to demonstrate the ways in which it does, and does not, explain and accurately predict student learning in terms of how different kinds of errors decline over time as it learns from an Intelligent Tutoring System (ITS), as human students do.
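The nuanced evaluation argued for here can be pictured as computing one error-rate learning curve per error category rather than a single aggregate curve. The sketch below does this from a hypothetical attempt log of (opportunity, error-type) records; the log format and the error labels are invented for illustration.

```python
# Sketch of per-error-type learning curves: instead of one aggregate
# error-rate curve, compute a separate curve for each error category.
# The (opportunity, error_type_or_None) log format is a hypothetical example.
from collections import defaultdict

def learning_curves(log):
    """log: iterable of (opportunity_index, error_type_or_None) attempts."""
    attempts = defaultdict(int)                      # attempts per opportunity
    errors = defaultdict(lambda: defaultdict(int))   # errors[type][opportunity]
    for opp, err in log:
        attempts[opp] += 1
        if err is not None:
            errors[err][opp] += 1
    opportunities = sorted(attempts)
    return {
        etype: [by_opp[o] / attempts[o] for o in opportunities]
        for etype, by_opp in errors.items()
    }

# Example: a "slip" that vanishes quickly vs. a persistent "misconception".
log = [(0, "slip"), (0, "misconception"), (1, None), (1, "misconception"),
       (2, None), (2, None)]
print(learning_curves(log))
# {'slip': [0.5, 0.0, 0.0], 'misconception': [0.5, 0.5, 0.0]}
```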
  3. Simulating real-world experiences in a safe environment has made virtual human medical simulations a common tool for research and interpersonal communication training. Despite these benefits, previous work suggests that users struggle to notice when virtual humans make potentially life-threatening verbal communication mistakes within such simulations. In this work, we performed a 2x2 mixed-design user study in which learners (n = 80) attempted to identify verbal communication mistakes made by a virtual human acting as a nurse in a virtual desktop environment. A virtual desktop environment was used instead of a head-mounted virtual reality environment due to COVID-19 limitations. This setup allowed us to explore how frequently learners identify verbal communication mistakes in virtual human medical simulations and how perceptions of the virtual human's credibility, reliability, and trustworthiness affect learner error-recognition rates. We found that learners struggle to identify infrequent virtual human verbal communication mistakes. Additionally, learners with lower initial trustworthiness ratings are more likely to overlook potentially life-threatening mistakes, and virtual human mistakes temporarily lower learners' credibility, reliability, and trustworthiness ratings of virtual humans. From these findings, we provide insights for improving virtual human medical simulation design, which developers can use to build virtual simulations for error-identification training with virtual humans.
  4. A striking property of transformers is their ability to perform in-context learning (ICL), a machine learning framework in which the learner is presented with a novel context at inference time, specified implicitly through data, and is tasked with making a prediction in that context; the learner must therefore adapt to the context without additional training. We explore the role of softmax attention in an ICL setting where each context encodes a regression task. We show that an attention unit learns a window that it uses to implement a nearest-neighbors predictor adapted to the landscape of the pretraining tasks. Specifically, we show that this window widens with decreasing Lipschitzness and increasing label noise in the pretraining tasks. We also show that, on low-rank linear problems, the attention unit learns to project onto the appropriate subspace before inference. Further, we show that this adaptivity relies crucially on the softmax activation and thus cannot be replicated by the linear activation often studied in prior theoretical analyses.
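The nearest-neighbors reading of softmax attention can be made concrete with a small NumPy sketch: softmax weights computed from negative squared distances to the context inputs yield a kernel-weighted average of the context labels, with a `width` parameter standing in for the learned window. This is a toy illustration of the claim, not the paper's trained attention unit.

```python
# Toy illustration: softmax attention over in-context (x_i, y_i) pairs acts
# as a soft nearest-neighbors regressor. `width` stands in for the learned
# window; this is an assumption-laden sketch, not the paper's trained model.
import numpy as np

def attention_predict(x_query, xs, ys, width=1.0):
    """Kernel-weighted average of context labels via softmax attention.

    A wider window averages over more context points (robust to label
    noise); a narrower one tracks rougher (less Lipschitz-smooth) targets.
    """
    scores = -np.sum((xs - x_query) ** 2, axis=1) / width
    w = np.exp(scores - scores.max())   # softmax over the context points
    w /= w.sum()
    return w @ ys                        # attention-weighted label average

rng = np.random.default_rng(0)
xs = rng.normal(size=(32, 4))                      # context inputs
ys = np.sin(xs[:, 0]) + 0.1 * rng.normal(size=32)  # noisy regression labels
print(attention_predict(np.zeros(4), xs, ys, width=0.5))
```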
  5. Federated Graph Learning (FGL) is tasked with training machine learning models, such as Graph Neural Networks (GNNs), for multiple clients, each with its own graph data. Existing methods usually assume that each client has both the node features and the graph structure of its data. In real-world scenarios, however, there exist federated systems in which only some clients have such data, while other clients (i.e., graphless clients) have only node features. This naturally leads to a novel problem in FGL: how do we jointly train a model over distributed graph data with graphless clients? In this paper, we propose a novel framework, FedGLS, to tackle this problem. In FedGLS, we devise a local graph learner on each graphless client that learns the local graph structure using structure knowledge transferred from other clients. To enable structure knowledge transfer, we design a GNN model and a feature encoder on each client. During local training, the feature encoder retains the local graph structure knowledge together with the GNN model via knowledge distillation, and the structure knowledge is transferred among clients during the global update.
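A minimal sketch of the distillation step described here, under the assumption of a standard softened-logit distillation loss: the feature encoder (a student seeing only node features) is trained to match the GNN's predictions, so that structure knowledge can travel with the encoder. The module shapes, temperature `T`, and mixing weight `alpha` are illustrative, not FedGLS's exact design.

```python
# Simplified sketch of the local distillation step: the feature encoder
# (features only) matches the GNN's outputs (features + structure) via a
# standard softened-logit KD loss. T and alpha are illustrative assumptions.
import torch
import torch.nn.functional as F

def local_distill_step(gnn_logits, encoder, x, y, opt, T=2.0, alpha=0.5):
    """One local update: task loss plus distillation against the GNN."""
    opt.zero_grad()
    student = encoder(x)                   # logits from node features only
    task = F.cross_entropy(student, y)
    kd = F.kl_div(                         # match the GNN's softened output
        F.log_softmax(student / T, dim=1),
        F.softmax(gnn_logits.detach() / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    (alpha * task + (1 - alpha) * kd).backward()
    opt.step()
```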