NSF PAR Search | NSF Public Access Repository

Meta-Learning Parameterized Skills

Fu, Haotian; Yu, Shangqun; Tiwari, Saket; Littman, Michael; Konidaris, George (July 2023, Proceedings of the Fortieth International Conference on Machine Learning)

We propose a novel parameterized skill-learning algorithm that aims to learn transferable parameterized skills and synthesize them into a new action space that supports efficient learning in long-horizon tasks. The algorithm leverages off-policy Meta-RL combined with a trajectory-centric smoothness term to learn a set of parameterized skills. Our agent can use these learned skills to construct a three-level hierarchical framework that models a Temporally-extended Parameterized Action Markov Decision Process. We empirically demonstrate that the proposed algorithm enables an agent to solve a set of difficult long-horizon (obstacle-course and robot manipulation) tasks.
Full Text Available
Helping Users Debug Trigger-Action Programs

Zhang, Lefan; Zhou, Cyrus; Littman, Michael L.; Ur, Blase; Lu, Shan (December 2022, Proceedings of the ACM on interactive mobile wearable and ubiquitous technologies)

Trigger-action programming (TAP) empowers a wide array of users to automate Internet of Things (IoT) devices. However, it can be challenging for users to create completely correct trigger-action programs (TAPs) on the first try, necessitating debugging. While TAP has received substantial research attention, TAP debugging has not. In this paper, we present the first empirical study of users’ end-to-end TAP debugging process, focusing on obstacles users face in debugging TAPs and how well users ultimately fix incorrect automations. To enable this study, we added TAP capabilities to an existing 3-D smart home simulator. Thirty remote participants spent a total of 84 hours debugging TAPs using this simulator. Without additional support, participants were often unable to fix buggy TAPs due to a series of obstacles we document. However, we also found that two novel tools we developed helped participants overcome many of these obstacles and more successfully debug TAPs. These tools collect either implicit or explicit feedback from users about automations that should or should not have happened in the past, and use a SAT-solving-based algorithm we developed to automatically modify the TAPs to account for this feedback.
Full Text Available
On the (In)Tractability of Reinforcement Learning for LTL Objectives

https://doi.org/10.24963/ijcai.2022/507

Yang, Cambridge; Littman, Michael L.; Carbin, Michael (July 2022, Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22))

Full Text Available
Collusion rings threaten the integrity of computer science research

https://doi.org/10.1145/3429776

Littman, Michael L. (June 2021, Communications of the ACM)

Experiences discovering attempts to subvert the peer-review process.
Full Text Available
Supporting End Users in Defining Reinforcement-Learning Problems for Human-Robot Interactions (Extended Abstract)

Zhao, Valerie; Littman, Michael L.; Lu, Shan; Sebo, Sarah; Ur, Blase (January 2022, The 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM))

Reinforcement learning (RL) can help agents learn complex tasks that would be hard to specify using standard imperative programming. However, end users may have trouble personalizing their technology using RL due to a lack of technical expertise. Prior work has explored means of supporting end users after a problem for the RL agent to solve has been defined. Little work, however, has explored how to support end users when defining this problem. We propose a tool to provide structured support for end users defining problems for RL agents. Through this tool, users can (i) directly and indirectly specify the problem as a Markov decision process (MDP); (ii) receive automatic suggestions on possible MDP changes that would improve training time and accuracy; and (iii) revise the MDP after training the agent to solve it. We believe this work will help reduce barriers to using RL and contribute to the existing literature on designing human-in-the-loop systems.
Full Text Available
Communication in action: Planning and interpreting communicative demonstrations.

https://doi.org/10.1037/xge0001035

Ho, Mark K.; Cushman, Fiery; Littman, Michael L.; Austerweil, Joseph L. (November 2021, Journal of Experimental Psychology: General)

Full Text Available
Explaining Why: How Instructions and User Interfaces Impact Annotator Rationales When Labeling Text Data

https://doi.org/10.18653/v1/2022.naacl-main.38

Sullivan Jr., Jamar; Brackenbury, Will; McNutt, Andrew; Bryson, Kevin; Byll, Kwam; Chen, Yuxin; Littman, Michael; Tan, Chenhao; Ur, Blase (January 2022, Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies)

Full Text Available
Understanding Trigger-Action Programs Through Novel Visualizations of Program Differences

https://doi.org/10.1145/3411764.3445567

Zhao, Valerie; Zhang, Lefan; Wang, Bo; Littman, Michael L.; Lu, Shan; Ur, Blase (January 2021, Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI '21))
Trigger-action programming (if-this-then-that rules) empowers non-technical users to automate services and smart devices. As a user's set of trigger-action programs evolves, the user must reason about behavior differences between similar programs, such as between an original program and several modification candidates, to select programs that meet their goals. To facilitate this process, we co-designed user interfaces and underlying algorithms to highlight differences between trigger-action programs. Our novel approaches leverage formal methods to efficiently identify and visualize differences in program outcomes or abstract properties. We also implemented a traditional interface that shows only syntax differences in the rules themselves. In a between-subjects online experiment with 107 participants, the novel interfaces better enabled participants to select trigger-action programs matching intended goals in complex, yet realistic, situations that proved very difficult when using traditional interfaces showing syntax differences.
Full Text Available
On the Expressivity of Markov Reward

Abel, David; Dabney, Will; Harutyunyan, Anna; Ho, Mark; Littman, Michael; Precup, Doina; Singh, Satinder (January 2021, Neural Information Processing Systems)

Reward is the driving force for reinforcement-learning agents. This paper is dedicated to understanding the expressivity of reward as a way to capture tasks that we would want an agent to perform. We frame this study around three new abstract notions of “task” that might be desirable: (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajectories. Our main results prove that while reward can express many of these tasks, there exist instances of each task type that no Markov reward function can capture. We then provide a set of polynomial-time algorithms that construct a Markov reward function that allows an agent to optimize tasks of each of these three types, and correctly determine when no such reward function exists. We conclude with an empirical study that corroborates and illustrates our theoretical findings.
Full Text Available
Applying prerequisite structure inference to adaptive testing

https://doi.org/10.1145/3375462.3375541

Saarinen, Sam; Cater, Evan; Littman, Michael L. (March 2020, Learning Analytics & Knowledge Conference)

Modeling student knowledge is important for assessment design, adaptive testing, curriculum design, and pedagogical intervention. The assessment design community has primarily focused on continuous latent-skill models with strong conditional independence assumptions among knowledge items, while the prerequisite discovery community has developed many models that aim to exploit the interdependence of discrete knowledge items. This paper attempts to bridge the gap by asking, "When does modeling assessment item interdependence improve predictive accuracy?" A novel adaptive testing evaluation framework is introduced that is amenable to techniques from both communities, and an efficient algorithm, Directed Item-Dependence And Confidence Thresholds (DIDACT), is introduced and compared with an Item-Response-Theory based model on several real and synthetic datasets. Experiments suggest that assessments with closely related questions benefit significantly from modeling item interdependence.
Full Text Available
