RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning
This paper presents a deep reinforcement learning algorithm for online accompaniment generation, with potential for real-time interactive human-machine duet improvisation. Unlike offline music generation and harmonization, online music accompaniment requires the algorithm to respond to human input and generate the machine counterpart sequentially. We cast this as a reinforcement learning problem, where the generation agent learns a policy to generate a musical note (action) based on previously generated context (state). The key to this algorithm is a well-functioning reward model. Instead of defining it using music composition rules, we learn this model from monophonic and polyphonic training data. This model considers the compatibility of the machine-generated note with both the machine-generated context and the human-generated context. Experiments show that this algorithm is able to respond to the human part and generate a melodic, harmonic, and diverse machine part. Subjective evaluations of preferences show that the proposed algorithm generates music pieces of higher quality than the baseline method.
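The formulation in the abstract (state = previously generated human and machine context, action = next machine note, reward = a model learned from data) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's method: the training duets are invented, the reward model is a smoothed count table rather than a neural network, and the policy is a greedy argmax over the reward rather than a policy trained with reinforcement learning.

```python
import math
from collections import Counter

# Hypothetical toy data: each duet is a list of (human_pitch, machine_pitch) MIDI pairs.
TRAINING_DUETS = [
    [(60, 64), (62, 65), (64, 67), (65, 69)],
    [(67, 64), (65, 62), (64, 60), (62, 59)],
]

# Candidate actions: the pitches the machine part may play.
PITCH_RANGE = range(55, 80)


def fit_reward_model(duets):
    """Learn count tables from data instead of hand-writing composition rules.

    harmonic: pitch class of (machine note at t) minus (human note at t-1)
    melodic:  step from (machine note at t-1) to (machine note at t)
    """
    harmonic, melodic = Counter(), Counter()
    for duet in duets:
        for i in range(1, len(duet)):
            prev_human, prev_machine = duet[i - 1]
            machine = duet[i][1]
            harmonic[(machine - prev_human) % 12] += 1
            melodic[machine - prev_machine] += 1
    return harmonic, melodic


def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log-probability of `key` under a count table."""
    total = sum(counter.values())
    return math.log((counter[key] + 1) / (total + n_outcomes))


def reward(model, state, action):
    """Score an action against both the human and the machine context."""
    harmonic, melodic = model
    human_context, machine_context = state
    score = 0.0
    if human_context:
        score += log_prob(harmonic, (action - human_context[-1]) % 12, 12)
    if machine_context:
        n_steps = 2 * len(PITCH_RANGE) - 1  # number of possible melodic steps
        score += log_prob(melodic, action - machine_context[-1], n_steps)
    return score


def accompany(model, human_part):
    """Generate the machine part one note at a time (greedy stand-in policy)."""
    machine_part = []
    for t in range(len(human_part)):
        # The state holds only notes produced before time t (online setting).
        state = (human_part[:t], machine_part[:t])
        action = max(PITCH_RANGE, key=lambda a: reward(model, state, a))
        machine_part.append(action)
    return machine_part


model = fit_reward_model(TRAINING_DUETS)
machine_part = accompany(model, [60, 62, 64, 65])
```

The point of the sketch is the interface: the reward depends on both contexts, and each note is chosen before the next human note is seen. A greedy policy like this one ignores long-term return, which is what a learned reinforcement learning policy is meant to capture.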
- Award ID(s): 1922591
- PAR ID: 10191377
- Date Published:
- Journal Name: Proceedings of the AAAI Conference on Artificial Intelligence
- Volume: 34
- Issue: 01
- ISSN: 2159-5399
- Page Range / eLocation ID: 710 to 718
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Participating in online communities has significant benefits for students' motivation, persistence, and learning outcomes. However, maintaining and supporting online learning communities is very challenging and requires tremendous work. Automatic support is desirable in this situation. The purpose of this work is to explore the use of deep learning algorithms for automatic text generation in providing emotional and community support for a massive online learning community, Scratch. In particular, state-of-the-art deep learning language models, GPT-2 and a recurrent neural network (RNN), are trained on two million comments from the online learning community. We then conduct both a readability test and a human evaluation of the automatically generated results for offering support to the online students. The results show that the GPT-2 language model can provide timely, human-like replies in a style genuine to the data set and context for offering related support.
- A model that maps the requisite skills, or knowledge components, to the contents of an online course is necessary to implement many adaptive learning technologies. However, developing a skill model and tagging courseware contents with individual skills can be expensive and error-prone. We propose a technology, called Smart (Skill Model mining with Automated detection of Resemblance among Texts), to automatically identify latent skills from instructional text in existing online courseware. Smart is capable of mining, labeling, and mapping skills without using an existing skill model or student learning (aka response) data. The goal of our proposed approach is to mine latent skills from assessment items included in existing courseware, provide discovered skills with human-friendly labels, and map didactic paragraph texts to skills. This way, a mapping between assessment items and paragraph texts is formed. In doing so, automated skill models produced by Smart will reduce the workload of courseware developers while enabling adaptive online content at the launch of the course. In our evaluation study, we applied Smart to two existing authentic online courses. We then compared machine-generated skill models and human-crafted skill models in terms of the accuracy of predicting students’ learning. We also evaluated the similarity between machine-generated and human-crafted skill models. The results show that student models based on Smart-generated skill models were as predictive of students’ learning as those based on human-crafted skill models, as validated on two OLI (Open Learning Initiative) courses. Also, Smart can generate skill models that are highly similar to human-crafted models, as evidenced by the normalized mutual information (NMI) values.
- This paper presents a framework to learn the reward function underlying high-level sequential tasks from demonstrations. The purpose of reward learning, in the context of learning from demonstration (LfD), is to generate policies that mimic the demonstrator’s policies, thereby enabling imitation learning. We focus on a human-robot interaction (HRI) domain where the goal is to learn and model structured interactions between a human and a robot. Such interactions can be modeled as a partially observable Markov decision process (POMDP) where the partial observability is caused by uncertainties associated with the ways humans respond to different stimuli. The key challenge in finding a good policy in such a POMDP is determining the reward function that was observed by the demonstrator. Existing inverse reinforcement learning (IRL) methods for POMDPs are computationally very expensive and the problem is not well understood. In comparison, IRL algorithms for Markov decision processes (MDPs) are well defined and computationally efficient. We propose an approach to reward function learning for high-level sequential tasks from human demonstrations where the core idea is to reduce the underlying POMDP to an MDP and apply any efficient MDP-IRL algorithm. Our extensive experiments suggest that the reward function learned this way generates POMDP policies that mimic the policies of the demonstrator well.
- We conduct a large-scale, systematic study to evaluate the existing evaluation methods for natural language generation in the context of generating online product reviews. We compare human-based evaluators with a variety of automated evaluation procedures, including discriminative evaluators that measure how well machine-generated text can be distinguished from human-written text, as well as word overlap metrics that assess how similar the generated text is to human-written references. We determine to what extent these different evaluators agree on the ranking of a dozen state-of-the-art generators for online product reviews. We find that human evaluators do not correlate well with discriminative evaluators, leaving a bigger question of whether adversarial accuracy is the correct objective for natural language generation. In general, distinguishing machine-generated text is challenging even for human evaluators, and human decisions correlate better with lexical overlaps. We find lexical diversity an intriguing metric that is indicative of the assessments of different evaluators. A post-experiment survey of participants provides insights into how to evaluate and improve the quality of natural language generation systems.