Simulated learners represent computational theories of human learning that can be used to evaluate educational technologies, provide practice opportunities for teachers, and advance our theoretical understanding of human learning. A key challenge in working with simulated learners is evaluating how accurately the simulation matches the behavior of real human students. This evaluation is often done by comparing the error-rate learning curves of a population of human learners with those of a corresponding set of simulated learners. In this paper, we argue that this approach misses an opportunity to capture nuances in learning more accurately because it treats all errors as the same. We present a simulated learner system, the Apprentice Learner (AL) Architecture, and use this more nuanced evaluation to demonstrate where it does and does not explain and accurately predict student learning, measured as the reduction of different kinds of errors over time as it learns, as human students do, from an Intelligent Tutoring System (ITS).
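The comparison this abstract describes can be made concrete with a few lines of analysis code. The sketch below uses illustrative column names and toy data rather than the AL Architecture's actual log schema: it computes per-opportunity error rates split by error type for the human and simulated populations and measures the gap between the resulting curves.

```python
# Minimal sketch of a per-error-type learning-curve comparison, assuming tutor
# log data with one row per student attempt. Column names and values are
# illustrative only, not the actual data format used in the paper.
import pandas as pd

log = pd.DataFrame({
    "population":  ["human"] * 4 + ["simulated"] * 4,
    "error_type":  ["slip", "slip", "misconception", "misconception"] * 2,
    "opportunity": [1, 2, 1, 2] * 2,   # nth practice opportunity on a skill
    "is_error":    [1, 0, 1, 1, 1, 0, 0, 0],
})

# The aggregate error rate per opportunity is the usual learning curve;
# splitting by error_type gives the more nuanced curves argued for above.
curves = (log.groupby(["population", "error_type", "opportunity"])["is_error"]
             .mean()
             .unstack("population"))
curves["abs_gap"] = (curves["human"] - curves["simulated"]).abs()
print(curves)
```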
Learning from Mistakes – a Framework for Neural Architecture Search
Learning from one's mistakes is an effective human learning technique in which learners focus on the topics where they made mistakes in order to deepen their understanding. In this paper, we investigate whether this human learning strategy can be applied in machine learning. We propose a novel machine learning method called Learning From Mistakes (LFM), wherein the learner improves its ability to learn by focusing more on its mistakes during revision. We formulate LFM as a three-stage optimization problem: 1) the learner learns; 2) the learner re-learns, focusing on its mistakes; and 3) the learner validates its learning. We develop an efficient algorithm to solve the LFM problem. We apply the LFM framework to neural architecture search on CIFAR-10, CIFAR-100, and ImageNet. Experimental results demonstrate the effectiveness of our method.
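The three-stage formulation can be illustrated with a small, self-contained sketch. The example below uses a linear classifier on synthetic data; the mistake-weighting scheme, model, and hyperparameters are illustrative assumptions rather than the paper's actual NAS setup, in which the stage-3 validation signal drives the architecture update.

```python
# Minimal sketch of the three-stage "learn, re-learn from mistakes, validate"
# loop on synthetic data. Names such as lfm_round and mistake_boost are
# illustrative, not taken from the paper.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
X_tr, y_tr = torch.randn(512, 20), torch.randint(0, 2, (512,))
X_val, y_val = torch.randn(128, 20), torch.randint(0, 2, (128,))

def lfm_round(mistake_boost: float = 5.0, epochs: int = 50) -> float:
    model = torch.nn.Linear(20, 2)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    # Stage 1: the learner learns on the training set with uniform example weights.
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(model(X_tr), y_tr).backward()
        opt.step()

    # Stage 2: the learner re-learns, focusing on its mistakes by up-weighting
    # the training examples it currently misclassifies.
    with torch.no_grad():
        wrong = (model(X_tr).argmax(dim=1) != y_tr).float()
    weights = 1.0 + mistake_boost * wrong
    for _ in range(epochs):
        opt.zero_grad()
        per_example = F.cross_entropy(model(X_tr), y_tr, reduction="none")
        (per_example * weights).mean().backward()
        opt.step()

    # Stage 3: the learner validates its learning; in the NAS setting this
    # validation signal is what drives the architecture update.
    with torch.no_grad():
        return (model(X_val).argmax(dim=1) == y_val).float().mean().item()

print(f"validation accuracy: {lfm_round():.3f}")
```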
- Award ID(s): 2120019
- PAR ID: 10441673
- Date Published:
- Journal Name: Proceedings of the AAAI Conference on Artificial Intelligence
- Volume: 36
- Issue: 9
- ISSN: 2159-5399
- Page Range / eLocation ID: 10184 to 10192
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Simulating real-world experiences in a safe environment has made virtual human medical simulations a common tool for research and interpersonal communication training. Despite the benefits virtual human medical simulations provide, previous work suggests that users struggle to notice when virtual humans make potentially life-threatening verbal communication mistakes inside virtual human medical simulations. In this work, we performed a 2×2 mixed-design user study in which learners (n = 80) attempted to identify verbal communication mistakes made by a virtual human acting as a nurse in a virtual desktop environment. A virtual desktop environment was used instead of a head-mounted virtual reality environment due to Covid-19 limitations. The virtual desktop environment allowed us to explore how frequently learners identify verbal communication mistakes in virtual human medical simulations and how perceptions of the virtual human's credibility, reliability, and trustworthiness affect learner error-recognition rates. We found that learners struggle to identify infrequent virtual human verbal communication mistakes. Additionally, learners with lower initial trustworthiness ratings are more likely to overlook potentially life-threatening mistakes, and virtual human mistakes temporarily lower learners' credibility, reliability, and trustworthiness ratings of virtual humans. From these findings, we provide insights on improving virtual human medical simulation design. Developers can use these insights to design virtual simulations for error-identification training using virtual humans.
This paper presents a deep reinforcement learning algorithm for online accompaniment generation, with potential for real-time interactive human-machine duet improvisation. Unlike offline music generation and harmonization, online music accompaniment requires the algorithm to respond to human input and generate the machine counterpart sequentially. We cast this as a reinforcement learning problem, where the generation agent learns a policy to generate a musical note (action) based on previously generated context (state). The key to this algorithm is a well-functioning reward model. Instead of defining it using music composition rules, we learn this model from monophonic and polyphonic training data. The model considers the compatibility of the machine-generated note with both the machine-generated context and the human-generated context. Experiments show that this algorithm is able to respond to the human part and generate a melodic, harmonic, and diverse machine part. Subjective evaluations of preferences show that the proposed algorithm generates music pieces of higher quality than the baseline method.
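The sequential decision loop described here can be sketched as follows. The context window, note vocabulary, and network sizes below are illustrative assumptions, and the human input is stubbed with random notes; the point is the interface: the state is the human plus machine context, the action is the next machine note, and a learned reward model scores compatibility with both contexts.

```python
# Minimal sketch of an online accompaniment loop: at each step the policy maps
# the recent human and machine notes to a distribution over the next machine
# note, and a learned reward model scores its compatibility with both contexts.
# Pitch range, context length, and architectures are illustrative assumptions.
import torch
import torch.nn as nn

PITCHES, CTX = 64, 8          # toy note vocabulary and context window

policy = nn.Sequential(nn.Linear(2 * CTX, 128), nn.ReLU(), nn.Linear(128, PITCHES))
reward_model = nn.Sequential(nn.Linear(2 * CTX + 1, 64), nn.ReLU(), nn.Linear(64, 1))

human_ctx = torch.zeros(CTX)      # most recent human notes (normalized pitches)
machine_ctx = torch.zeros(CTX)    # most recent machine notes

for step in range(16):
    # A human note arrives (stubbed with a random pitch here; real-time in the duet).
    human_note = torch.randint(0, PITCHES, (1,)).float() / PITCHES
    human_ctx = torch.cat([human_ctx[1:], human_note])

    # State = both contexts; action = next machine note, sampled from the policy.
    state = torch.cat([human_ctx, machine_ctx])
    dist = torch.distributions.Categorical(logits=policy(state))
    action = dist.sample()

    # The learned reward model scores the chosen note against both contexts;
    # during training this reward would drive a policy-gradient update.
    machine_note = action.float().unsqueeze(0) / PITCHES
    reward = reward_model(torch.cat([state, machine_note]))
    machine_ctx = torch.cat([machine_ctx[1:], machine_note])
```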
State-of-the-art (SOTA) semi-supervised learning (SSL) methods have been highly successful in leveraging a mix of labeled and unlabeled data, often via self-training or pseudo-labeling. During pseudo-labeling, the model's predictions on unlabeled data are used for training and may result in confirmation bias, where the model reinforces its own mistakes. In this work, we show that SOTA SSL methods often suffer from confirmation bias and demonstrate that this is often a result of using a poorly calibrated classifier for pseudo-labeling. We introduce BaM-SSL, an efficient Bayesian model averaging technique that improves uncertainty quantification in SSL methods with limited computational or memory overhead. We demonstrate that BaM-SSL mitigates confirmation bias in SOTA SSL methods on the standard vision benchmarks CIFAR-10 and CIFAR-100, giving up to a 16% improvement in test accuracy on the CIFAR-100 benchmark with 400 labels. Furthermore, we demonstrate its effectiveness on additional realistic and challenging problems, such as class-imbalanced datasets and problems in photonics science.
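The core idea, averaging over several plausible models before committing to pseudo-labels, can be illustrated with a generic sketch. This is not the paper's exact BaM-SSL procedure; the ensemble size and confidence threshold below are assumptions.

```python
# Illustrative sketch: average softmax predictions over several models before
# pseudo-labeling, so an overconfident, poorly calibrated single model is less
# likely to reinforce its own mistakes. Generic model averaging, not BaM-SSL itself.
import torch
import torch.nn.functional as F

def averaged_pseudo_labels(models, x_unlabeled, threshold=0.95):
    """Average class probabilities over models; keep only confident pseudo-labels."""
    with torch.no_grad():
        probs = torch.stack(
            [F.softmax(m(x_unlabeled), dim=1) for m in models]
        ).mean(0)
    conf, labels = probs.max(dim=1)
    mask = conf >= threshold          # train only on pseudo-labels the average trusts
    return labels[mask], mask

# Toy usage: three weight samples of the same classifier stand in for the
# Bayesian model average.
models = [torch.nn.Linear(10, 3) for _ in range(3)]
x_u = torch.randn(32, 10)
labels, mask = averaged_pseudo_labels(models, x_u)
print(f"kept {mask.sum().item()} of {len(x_u)} unlabeled examples")
```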
As robots become pervasive in human environments, it is important to enable users to effectively convey new skills without programming. Most existing work on Interactive Reinforcement Learning focuses on interpreting and incorporating non-expert human feedback to speed up learning; we aim to design a better representation of the learning agent that elicits more natural and effective communication between the human trainer and the learner, while treating human feedback as discrete communication that depends probabilistically on the trainer's target policy. This work includes a user study in which participants train a virtual agent to accomplish tasks by giving reward and/or punishment in a variety of simulated environments. We present results from 60 participants showing how a learner can ground natural language commands and adapt its action execution speed to learn more efficiently from human trainers. The agent's action execution speed can be modulated to encourage more explicit feedback from a human trainer in areas of the state space where there is high uncertainty. Our results show that our adaptive-speed agent outperforms fixed-speed agents on several measures of performance. Additionally, we investigate the impact of instructions on user performance and user preference across training conditions.
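The speed-modulation mechanism can be sketched in a few lines: map a normalized uncertainty measure over the agent's action values to a pause before acting, so the trainer has more time to respond in uncertain states. The uncertainty measure and delay range below are illustrative assumptions, not the study's implementation.

```python
# Minimal sketch of adaptive action execution speed: slow down where the
# agent is uncertain so the human trainer has more time to give feedback.
import numpy as np

def execution_delay(q_values, min_delay=0.2, max_delay=2.0):
    """Map the agent's uncertainty in a state to a pause before acting (seconds)."""
    probs = np.exp(q_values - q_values.max())
    probs /= probs.sum()
    entropy = -(probs * np.log(probs + 1e-12)).sum()
    uncertainty = entropy / np.log(len(q_values))   # normalized to [0, 1]
    return min_delay + uncertainty * (max_delay - min_delay)

print(execution_delay(np.array([0.9, 0.1, 0.0])))   # confident -> shorter pause
print(execution_delay(np.array([0.5, 0.5, 0.5])))   # uncertain -> longest pause
```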