NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Composable Interaction Primitives: A Structured Policy Class for Efficiently Learning Sustained-Contact Manipulation Skills

Abbatematteo, Ben; Rosen, Eric; Thompson, Skye; Akbulut, Tuluhan; Rammohan, Sreehari; Konidaris, George (May 2024, Proceedings of the 2024 IEEE Conference on Robotics and Automation)

We propose a new policy class, Composable Interaction Primitives (CIPs), specialized for learning sustained-contact manipulation skills like opening a drawer, pulling a lever, turning a wheel, or shifting gears. CIPs have two primary design goals: to minimize what must be learned by exploiting structure present in the world and the robot, and to support sequential composition by construction, so that learned skills can be used by a task-level planner. Using an ablation experiment in four simulated manipulation tasks, we show that the structure included in CIPs substantially improves the efficiency of motor skill learning. We then show that CIPs can be used for plan execution in a zero-shot fashion by sequencing learned skills.We validate our approach on real robot hardware by learning and sequencing two manipulation skills.
more » « less
Full Text Available
Effectively Learning Initiation Sets in Hierarchical Reinforcement Learning

Bagaria, Akhil; Abbatematteo, Ben; Gottesman, Omer; Corsaro, Matt; Rammohan, Sreehari; Konidaris, George (December 2023, 37th Conference on Neural Information Processing Systems)

An agent learning an option in hierarchical reinforcement learning must solve three problems: identify the option’s subgoal (termination condition), learn a policy, and learn where that policy will succeed (initiation set). The termination condition is typically identified first, but the option policy and initiation set must be learned simultaneously, which is challenging because the initiation set depends on the option policy, which changes as the agent learns. Consequently, data obtained from option execution becomes invalid over time, leading to an inaccurate initiation set that subsequently harms downstream task performance. We highlight three issues—data non-stationarity, temporal credit assignment, and pessimism—specific to learning initiation sets, and propose to address them using tools from off-policy value estimation and classification. We show that our method learns higher-quality initiation sets faster than existing methods (in MINIGRID and MONTEZUMA’S REVENGE), can automatically discover promising grasps for robot manipulation (in ROBOSUITE), and improves the performance of a state-of-the-art option discovery method in a challenging maze navigation task in MuJoCo.
more » « less
Full Text Available
Skill Generalization with Verbs

Ma, Rachel; Lam, Lyndon; Spiegel, Benjamin; Ganeshan, Aditya; Patel, Roma; Abbatematteo, Ben; Paulius, David Paulius; Tellex, Stefanie; Konidaris, George (October 2023, Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems)

It is imperative that robots can understand natural language commands issued by humans. Such commands typically contain verbs that signify what action should be performed on a given object and that are applicable to many objects. We propose a method for generalizing manipulation skills to novel objects using verbs. Our method learns a probabilistic classifier that determines whether a given object trajectory can be described by a specific verb. We show that this classifier accurately generalizes to novel object categories with an average accuracy of 76.69% across 13 object categories and 14 verbs. We then perform policy search over the object kinematics to find an object trajectory that maximizes classifier prediction for a given verb. Our method allows a robot to generate a trajectory for a novel object based on a verb, which can then be used as input to a motion planner. We show that our model can generate trajectories that are usable for executing five verb commands applied to novel instances of two different object categories on a real robot.
more » « less
Full Text Available

Search for: All records