NSF PAR Search | NSF Public Access Repository

RoboPianist: Dexterous Piano Playing with Deep Reinforcement Learning

Zakka, Kevin; Wu, Philipp; Smith, Laura; Gileadi, Nimrod; Howell, Taylor; Peng, Xue Bin; Singh, Sumeet; Tassa, Yuval; Florence, Pete; Zeng, Andy; et al. (November 2023, Conference on Robot Learning)

Replicating human-like dexterity in robot hands represents one of the largest open problems in robotics. Reinforcement learning is a promising approach that has achieved impressive progress in the last few years; however, the class of problems it has typically addressed corresponds to a rather narrow definition of dexterity as compared to human capabilities. To address this gap, we investigate piano playing, a skill that challenges even the human limits of dexterity, as a test of high-dimensional control: it requires high spatial and temporal precision as well as complex finger coordination and planning. We introduce RoboPianist, a system that enables simulated anthropomorphic hands to learn an extensive repertoire of 150 piano pieces where traditional model-based optimization struggles. We additionally introduce an open-sourced environment, benchmark of tasks, interpretable evaluation metrics, and open challenges for future study.
Full Text Available
Masked Autoencoding for Scalable and Generalizable Decision Making

Liu, Fangchen; Liu, Hao; Grover, Aditya; Abbeel, Pieter (January 2023, Advances in neural information processing systems)

Full Text Available
Causal Confusion and Reward Misidentification in Preference-Based Reward Learning

J. Tien, D. Brown (January 2023, International Conference on Learning Representations)

Full Text Available
Masked Trajectory Models for Prediction, Representation, and Control

Wu, Philipp; Majumdar, Arjun; Stone, Kevin; Lin, Yixin; Mordatch, Igor; Abbeel, Pieter; Rajeswaran, Aravind (January 2023, International Conference on Machine Learning)

Full Text Available
Similarity-based Implicit Representation Learning

A. Bobu, Y. Liu (January 2023, Human-Robot Interaction)

Full Text Available
Masked World Models for Visual Control

Seo, Younggyo; Hafner, Danijar; Liu, Hao; Liu, Fangchen; James, Stephen; Lee, Kimin; Abbeel, Pieter (January 2022, Conference on Robot Learning)

Full Text Available
Learning Representations that Enable Generalization in Assistive Tasks

J.Z.Y. He, A. Raghunathan (January 2022, Conference on Robot Learning)

Full Text Available
DayDreamer: World Models for Physical Robot Learning

Wu, Philipp; Escontrela, Alejandro; Hafner, Danijar; Goldberg, Ken; Abbeel, Pieter (January 2022, Conference on Robot Learning)

Full Text Available
Decision Transformer: Reinforcement Learning via Sequence Modeling

Chen, Lili; Lu, Kevin; Rajeswaran, Aravind; Lee, Kimin; Grover, Aditya; Laskin, Michael; Abbeel, Pieter; Srinivas, Aravind; Mordatch, Igor (December 2021, Advances in neural information processing systems)
We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior approaches to RL that fit value functions or compute policy gradients, Decision Transformer simply outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired return (reward), past states, and actions, our Decision Transformer model can generate future actions that achieve the desired return. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.
Full Text Available
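
The conditioning described in the Decision Transformer abstract can be illustrated with a small sketch. It assumes the "desired return" at each step is the sum of rewards from that step onward (often called return-to-go) and that the model input interleaves return, state, and action tokens. The function names `returns_to_go` and `build_sequence` and the tuple-based token format are hypothetical, chosen for illustration only; no Transformer is trained here.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Suffix sums: entry t is the total reward collected from step t onward."""
    out = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        out[t] = running
    return out


def build_sequence(
    rewards: Sequence[float],
    states: Sequence[Any],
    actions: Sequence[Any],
) -> List[Tuple[str, Any]]:
    """Interleave (return, state, action) tokens for each time step.

    An autoregressive model trained on such sequences could be asked to
    predict each action token from everything that precedes it.
    """
    sequence: List[Tuple[str, Any]] = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        sequence.extend([("return", g), ("state", s), ("action", a)])
    return sequence


if __name__ == "__main__":
    # Toy trajectory with three steps (values are made up for illustration).
    for token in build_sequence([1.0, 0.0, 2.0], ["s0", "s1", "s2"], ["a0", "a1", "a2"]):
        print(token)
```

At inference time, under the same assumptions, the first return token would be set to the return the user wants and decremented by each observed reward.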
Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings

Chen, Lili; Lee, Kimin; Srinivas, Aravind; Abbeel, Pieter (December 2021, Advances in neural information processing systems)
Recent advances in off-policy deep reinforcement learning (RL) have led to impressive success in complex tasks from visual observations. Experience replay improves sample-efficiency by reusing experiences from the past, and convolutional neural networks (CNNs) process high-dimensional inputs effectively. However, such techniques demand high memory and computational bandwidth. In this paper, we present Stored Embeddings for Efficient Reinforcement Learning (SEER), a simple modification of existing off-policy RL methods, to address these computational and memory requirements. To reduce the computational overhead of gradient updates in CNNs, we freeze the lower layers of CNN encoders early in training due to early convergence of their parameters. Additionally, we reduce memory requirements by storing the low-dimensional latent vectors for experience replay instead of high-dimensional images, enabling an adaptive increase in the replay buffer capacity, a useful technique in constrained-memory settings. In our experiments, we show that SEER does not degrade the performance of RL agents while significantly saving computation and memory across a diverse set of DeepMind Control environments and Atari games. Finally, we show that SEER is useful for computation-efficient transfer learning in RL because lower layers of CNNs extract generalizable features, which can be used for different tasks and domains.
Full Text Available
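
The memory-saving idea in the SEER abstract, storing low-dimensional latents in the replay buffer instead of raw images, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `frozen_encoder` is a hypothetical stand-in for frozen lower CNN layers (it only averages each image row), and `LatentReplayBuffer` and `capacity_for_budget` are names invented for this example.

```python
import random
from collections import deque
from typing import List, Sequence


def frozen_encoder(image: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical frozen encoder: maps an HxW image to H row averages."""
    return [sum(row) / len(row) for row in image]


def capacity_for_budget(budget_bytes: int, item_bytes: int) -> int:
    """Number of items that fit in a fixed memory budget."""
    return budget_bytes // item_bytes


class LatentReplayBuffer:
    """Replay buffer that stores encoder outputs rather than raw images."""

    def __init__(self, capacity: int) -> None:
        # A bounded deque drops the oldest transition once capacity is reached.
        self.storage = deque(maxlen=capacity)

    def add(self, image, action, reward, next_image) -> None:
        # Encode once at insertion time; the raw images are not kept.
        self.storage.append(
            (frozen_encoder(image), action, reward, frozen_encoder(next_image))
        )

    def sample(self, batch_size: int, rng=random):
        # Uniform sampling without replacement, capped at the buffer size.
        return rng.sample(list(self.storage), min(batch_size, len(self.storage)))

    def __len__(self) -> int:
        return len(self.storage)


if __name__ == "__main__":
    buffer = LatentReplayBuffer(capacity=2)
    image = [[0.0, 2.0], [4.0, 6.0]]
    for step in range(3):
        buffer.add(image, action=step, reward=0.5, next_image=image)
    print(len(buffer), buffer.sample(1))
```

Because each stored latent is smaller than the image it replaces, the same memory budget holds more transitions, which is one way to read the abstract's "adaptive increase in the replay buffer capacity".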
