NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Curriculum Design for Machine Learners in Sequential Decision Tasks

https://doi.org/10.1109/TETCI.2018.2829980

Peng, Bei; MacGlashan, James; Loftin, Robert; Littman, Michael L.; Roberts, David L.; Taylor, Matthew E. (August 2018, IEEE Transactions on Emerging Topics in Computational Intelligence)

Full Text Available
Social is special: A normative framework for teaching with and learning from evaluative feedback

https://doi.org/10.1016/j.cognition.2017.03.006

Ho, Mark K.; MacGlashan, James; Littman, Michael L.; Cushman, Fiery (October 2017, Cognition)

Full Text Available
Environment-Independent Task Specifications via GLTL

Littman, Michael L.; Topcu, Ufuk; Fu, Jie; Isbell, Charles; Wen, Min; MacGlashan, James (April 2017, arXiv.org)

We propose a new task-specification language for Markov decision processes that is designed to be an improvement over reward functions by being environment independent. The language is a variant of Linear Temporal Logic (LTL) that is extended to probabilistic specifications in a way that permits approximations to be learned in finite time. We provide several small environments that demonstrate the advantages of our geometric LTL (GLTL) language and illustrate how it can be used to specify standard reinforcement-learning tasks straightforwardly.
more » « less
Full Text Available
Curriculum Design for Machine Learners in Sequential Decision Tasks

Peng, Bei; MacGlashan, James; Loftin, Robert; Littman, Michael L.; Roberts, David L.; Taylor, Matthew E. (April 2017, AAMAS)

Existing machine-learning work has shown that algorithms can bene t from curricula---learning fi rst on simple examples before moving to more difficult examples. While most existing work on curriculum learning focuses on developing automatic methods to iteratively select training examples with increasing difficulty tailored to the current ability of the learner, relatively little attention has been paid to the ways in which humans design curricula. We argue that a better understanding of the human-designed curricula could give us insights into the development of new machine-learning algorithms and interfaces that can better accommodate machine- or human-created curricula. Our work addresses this emerging and vital area empirically, taking an important step to characterize the nature of human-designed curricula relative to the space of possible curricula and the performance benefits that may (or may not) occur.
more » « less
Full Text Available
Initial Progress Toward Development of a Voice-Based Computer-Delivered Motivational Intervention for Heavy Drinking College Students: An Experimental Study

https://doi.org/10.2196/mental.7571

Kahler, Christopher W; Lechner, William J; MacGlashan, James; Wray, Tyler B; Littman, Michael L (January 2017, JMIR Mental Health)

Full Text Available
Interactive Learning from Policy-Dependent Human Feedback

MacGlashan, James; K Ho, Mark; Loftin, Robert; Peng, Bei; Wang, Guan; Roberts, David L.; Taylor, Matthew E.; Littman, Michael L. (July 2017, ICML)

This paper investigates the problem of interactively learning behaviors communicated by a human teacher using positive and negative feedback. Much previous work on this problem has made the assumption that people provide feedback for decisions that is dependent on the behavior they are teaching and is independent from the learner’s current policy. We present empirical results that show this assumption to be false—whether human trainers give a positive or negative feedback for a decision is influenced by the learner’s current policy. Based on this insight, we introduce Convergent Actor-Critic by Humans (COACH), an algorithm for learning from policy-dependent feedback that converges to a local optimum. Finally, we demonstrate that COACH can successfully learn multiple behaviors on a physical robot.
more » « less
Full Text Available
Planning with Abstract Markov Decision Processes

Gopalan, Nakul; desJardins, Marie; Littman, Michael L.; MacGlashan, James; Squire, Shawn; Tellex, Stefanie; Winder, John; Wong, Lawson L.S. (January 2017, ICAPS)

Robots acting in human-scale environments must plan under uncertainty in large state–action spaces and face constantly changing reward functions as requirements and goals change. Planning under uncertainty in large state–action spaces requires hierarchical abstraction for efficient computation. We introduce a new hierarchical planning framework called Abstract Markov Decision Processes (AMDPs) that can plan in a fraction of the time needed for complex decision making in ordinary MDPs. AMDPs provide abstract states, actions, and transition dynamics in multiple layers above a base-level “flat” MDP. AMDPs decompose problems into a series of subtasks with both local reward and local transition functions used to create policies for subtasks. The resulting hierarchical planning method is independently optimal at each level of abstraction, and is recursively optimal when the local reward and transition functions are correct. We present empirical results showing significantly improved planning speed, while maintaining solution quality, in the Taxi domain and in a mobile-manipulation robotics problem. Furthermore, our approach allows specification of a decision-making model for a mobile-manipulation problem on a Turtlebot, spanning from low-level control actions operating on continuous variables all the way up through high-level object manipulation tasks.
more » « less
Full Text Available
A Need for Speed: Adapting Agent Action Speed to Improve Task Learning from Non-Expert Humans

Peng, Bei; MacGlashan, James; Loftin, Robert; Littman, Michael L.; Roberts, David L.; Taylor, Matthew E. (July 2016, Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems)

As robots become pervasive in human environments, it is important to enable users to effectively convey new skills without programming. Most existing work on Interactive Reinforcement Learning focuses on interpreting and incorporating non-expert human feedback to speed up learning; we aim to design a better representation of the learning agent that is able to elicit more natural and effective communication between the human trainer and the learner, while treating human feedback as discrete communication that depends probabilistically on the trainer's target policy. This work entails a user study where participants train a virtual agent to accomplish tasks by giving reward and/or punishment in a variety of simulated environments. We present results from 60 participants to show how a learner can ground natural language commands and adapt its action execution speed to learn more efficiently from human trainers. The agent's action execution speed can be successfully modulated to encourage more explicit feedback from a human trainer in areas of the state space where there is high uncertainty. Our results show that our novel adaptive speed agent dominates different fixed speed agents on several measures of performance. Additionally, we investigate the impact of instructions on user performance and user preference in training conditions.
more » « less
Full Text Available
A Need for Speed: Adapting Agent Action Speed to Improve Task Learning from Non-Expert Humans

Peng, Bei; MacGlashan, James; Loftin, Robert; Littman, Michael L.; Roberts, David L.; Taylor, Matthew E. (January 2016, AAMAS)

As robots become pervasive in human environments, it is important to enable users to effectively convey new skills without programming. Most existing work on Interactive Reinforcement Learning focuses on interpreting and incorporating non-expert human feedback to speed up learning; we aim to design a better representation of the learning agent that is able to elicit more natural and effective communication between the human trainer and the learner, while treating human feedback as discrete communication that depends probabilistically on the trainer’s target policy. This work entails a user study where participants train a virtual agent to accomplish tasks by giving reward and/or punishment in a variety of simulated environments. We present results from 60 participants to show how a learner can ground natural language commands and adapt its action execution speed to learn more efficiently from human trainers. The agent’s action execution speed can be successfully modulated to encourage more explicit feedback from a human trainer in areas of the state space where there is high uncertainty. Our results show that our novel adaptive speed agent dominates different fixed speed agents on several measures of performance. Additionally, we investigate the impact of instructions on user performance and user preference in training conditions.
more » « less
Full Text Available
Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning

https://doi.org/10.1007/s10458-015-9283-7

Loftin, Robert; Peng, Bei; MacGlashan, James; Littman, Michael L.; Taylor, Matthew E.; Huang, Jeff; Roberts, David L. (January 2016, Autonomous Agents and Multi-Agent Systems)

Full Text Available

Search for: All records