Robot arms should be able to learn new tasks. One framework here is reinforcement learning, where the robot is given a reward function that encodes the task, and the robot autonomously learns actions to maximize its reward. Existing approaches to reinforcement learning often frame this problem as a Markov decision process, and learn a policy (or a hierarchy of policies) to complete the task. These policies reason over hundreds of fine-grained actions that the robot arm needs to take: e.g., moving slightly to the right or rotating the end-effector a few degrees. But the manipulation tasks that we want robots to perform can often be broken down into a small number of high-level motions: e.g., reaching an object or turning a handle. In this paper we therefore propose a waypoint-based approach for model-free reinforcement learning. Instead of learning a low-level policy, the robot now learns a trajectory of waypoints, and then interpolates between those waypoints using existing controllers. Our key novelty is framing this waypoint-based setting as a sequence of multi-armed bandits: each bandit problem corresponds to one waypoint along the robot’s motion. We theoretically show that an ideal solution to this reformulation has lower regret bounds than standard frameworks. We also introduce an approximate posterior sampling solution that builds the robot’s motion one waypoint at a time. Results across benchmark simulations and two real-world experiments suggest that this proposed approach learns new tasks more quickly than state-of-the-art baselines. See our website here: https://collab.me.vt.edu/rl-waypoints/ 
                        more » 
                        « less   
                    This content will become publicly available on June 1, 2026
                            
                            UniT: Data Efficient Tactile Representation With Generalization to Unseen Objects
                        
                    
    
            UniT is an approach to tactile representation learn¬ing, using VQGAN to learn a compact latent space and serve as the tactile representation. It uses tactile images obtained from a single simple object to train the representation with generalizability. This tactile representation can be zero-shot transferred to various downstream tasks, including perception tasks and manipulation policy learning. Our benchmarkings on in-hand 3D pose and 6D pose estimation tasks and a tactile classifcation task show that UniT outperforms existing visual and tactile representation learning methods. Additionally, UniT’s effectiveness in policy learning is demonstrated across three real-world tasks involving diverse manipulated objects and complex robot-object-environment interactions. Through extensive experi¬mentation, UniT is shown to be a simple-to-train, plug-and-play, yet widely effective method for tactile representation learning. For more details, please refer to our open-source repository https://github.com/ZhengtongXu/UniT and the project website https://zhengtongxu.github.io/unit-website/. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 2423068
- PAR ID:
- 10599521
- Publisher / Repository:
- IEEE
- Date Published:
- Journal Name:
- IEEE Robotics and Automation Letters
- Volume:
- 10
- Issue:
- 6
- ISSN:
- 2377-3774
- Page Range / eLocation ID:
- 5481 to 5488
- Subject(s) / Keyword(s):
- Representation learning, force and tactile sensing, imitation learning.
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            Learning a robot motor skill from scratch is impractically slow; so much so that in practice, learning must typically be bootstrapped using human demonstration. However, relying on human demonstration necessarily degrades the autonomy of robots that must learn a wide variety of skills over their operational lifetimes. We propose using kinematic motion planning as a completely autonomous, sample efficient way to bootstrap motor skill learning for object manipulation. We demonstrate the use of motion planners to bootstrap motor skills in two complex object manipulation scenarios with different policy representations: opening a drawer with a dynamic movement primitive representation, and closing a microwave door with a deep neural network policy. We also show how our method can bootstrap a motor skill for the challenging dynamic task of learning to hit a ball off a tee, where a kinematic plan based on treating the scene as static is insufficient to solve the task, but sufficient to bootstrap a more dynamic policy. In all three cases, our method is competitive with human-demonstrated initialization, and significantly outperforms starting with a random policy. This approach enables robots to to efficiently and autonomously learn motor policies for dynamic tasks without human demonstration.more » « less
- 
            Recognizing and generating object-state compositions has been a challenging task, especially when generalizing to unseen compositions. In this paper, we study the task of cutting objects in different styles and the resulting object state changes. We propose a new benchmark suite Chop & Learn, to accommodate the needs of learning objects and different cut styles using multiple viewpoints. We also propose a new task of Compositional Image Generation, which can transfer learned cut styles to different objects, by generating novel object-state images. Moreover, we also use the videos for Compositional Action Recognition, and show valuable uses of this dataset for multiple video tasks. Project website: https://chopnlearn.github.io.more » « less
- 
            null (Ed.)To perform manipulation tasks in the real world, robots need to operate on objects with various shapes, sizes and without access to geometric models. To achieve this it is often infeasible to train monolithic neural network policies across such large variations in object properties. Towards this generalization challenge, we propose to learn modular task policies which compose object-centric task-axes controllers. These task-axes controllers are parameterized by properties associated with underlying objects in the scene. We infer these controller parameters directly from visual input using multi- view dense correspondence learning. Our overall approach provides a simple and yet powerful framework for learning manipulation tasks. We empirically evaluate our approach on 3 different manipulation tasks and show its ability to generalize to large variance in object size, shape and geometry.more » « less
- 
            We present a deep learning method for composite and task-driven motion control for physically simulated characters. In contrast to existing data-driven approaches using reinforcement learning that imitate full-body motions, we learn decoupled motions for specific body parts from multiple reference motions simultaneously and directly by leveraging the use of multiple discriminators in a GAN-like setup. In this process, there is no need of any manual work to produce composite reference motions for learning. Instead, the control policy explores by itself how the composite motions can be combined automatically. We further account for multiple task-specific rewards and train a single, multi-objective control policy. To this end, we propose a novel framework for multi-objective learning that adaptively balances the learning of disparate motions from multiple sources and multiple goal-directed control objectives. In addition, as composite motions are typically augmentations of simpler behaviors, we introduce a sample-efficient method for training composite control policies in an incremental manner, where we reuse a pre-trained policy as the meta policy and train a cooperative policy that adapts the meta one for new composite tasks. We show the applicability of our approach on a variety of challenging multi-objective tasks involving both composite motion imitation and multiple goal-directed control. Code is available at https://motion-lab.github.io/CompositeMotion.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
