NSF PAR Search | NSF Public Access Repository

Invariance Through Latent Alignment

https://doi.org/10.15607/RSS.2022.XVIII.064

Yoneda, Takuma; Yang, Ge; Walter, Matthew R.; Stadie, Bradly C. (June 2022, Proceedings of Robotics: Science and Systems (RSS))

A robot’s deployment environment often involves perceptual changes that differ from what it has experienced during training. Standard practices such as data augmentation attempt to bridge this gap by augmenting source images in an effort to extend the support of the training distribution to better cover what the agent might experience at test time. In many cases, however, it is impossible to know the test-time distribution shift a priori, making these schemes infeasible. In this paper, we introduce a general approach, called Invariance through Latent Alignment (ILA), that improves the test-time performance of a visuomotor control policy in deployment environments with unknown perceptual variations. ILA performs unsupervised adaptation at deployment time by matching the distribution of latent features on the target domain to the agent’s prior experience, without relying on paired data. Although simple, we show that this idea leads to surprising improvements on a variety of challenging adaptation scenarios, including changes in lighting conditions, the content in the scene, and camera poses. We present results on calibrated control benchmarks in simulation (the distractor control suite) and on a physical robot under a sim-to-real setup. Video and code are available at: https://invariance-through-latent-alignment.github.io
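The idea of matching target-domain latent features to the source distribution can be illustrated with a small sketch. The abstract does not state the matching objective, so this example substitutes simple per-dimension mean and standard-deviation matching as a stand-in; the function names `fit_stats` and `align` are hypothetical and are not taken from the authors' code.

```python
from statistics import fmean, pstdev


def fit_stats(features):
    """Return per-dimension (means, stds) for a list of latent vectors."""
    dims = list(zip(*features))
    return [fmean(d) for d in dims], [pstdev(d) for d in dims]


def align(target_features, source_stats):
    """Shift and rescale target latents so their per-dimension moments
    match the source statistics (an assumed stand-in for ILA's matching)."""
    t_means, t_stds = fit_stats(target_features)
    s_means, s_stds = source_stats
    aligned = []
    for vec in target_features:
        row = []
        for x, tm, ts, sm, ss in zip(vec, t_means, t_stds, s_means, s_stds):
            scale = ts if ts > 0 else 1.0  # avoid division by zero
            row.append((x - tm) / scale * ss + sm)
        aligned.append(row)
    return aligned


# Latents recorded during training (source) and at deployment (target).
source = [[0.0, 10.0], [2.0, 14.0]]
target = [[5.0, 1.0], [9.0, 3.0]]
aligned = align(target, fit_stats(source))
```

No paired source and target samples are needed here: only summary statistics of the source latents are used, which mirrors the unpaired setting the abstract describes.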
Full Text Available
Self-Supervised Camera Self-Calibration from Video

https://doi.org/10.1109/ICRA46639.2022.9811784

Fang, Jiading; Vasiljevic, Igor; Guizilini, Vitor; Ambrus, Rares; Shakhnarovich, Greg; Gaidon, Adrien; Walter, Matthew R (May 2022, IEEE)

Camera calibration is integral to robotics and computer vision algorithms that seek to infer geometric properties of the scene from visual input streams. In practice, calibration is a laborious procedure requiring specialized data collection and careful tuning. This process must be repeated whenever the parameters of the camera change, which can be a frequent occurrence for mobile robots and autonomous vehicles. In contrast, self-supervised depth and ego-motion estimation approaches can bypass explicit calibration by inferring per-frame projection models that optimize a view-synthesis objective. In this paper, we extend this approach to explicitly calibrate a wide range of cameras from raw videos in the wild. We propose a learning algorithm to regress per-sequence calibration parameters using an efficient family of general camera models. Our procedure achieves self-calibration results with sub-pixel reprojection error, outperforming other learning-based methods. We validate our approach on a wide variety of camera geometries, including perspective, fisheye, and catadioptric. Finally, we show that our approach leads to improvements in the downstream task of depth estimation, achieving state-of-the-art results on the EuRoC dataset with greater computational efficiency than contemporary methods.
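To make "reprojection error" concrete, the sketch below projects a 3D point with a standard pinhole (perspective) model and measures the pixel distance to an observation. This covers only the perspective case; the general camera models mentioned in the abstract are not reproduced here, and the function names and intrinsics values are illustrative assumptions.

```python
import math


def project_pinhole(point, fx, fy, cx, cy):
    """Project a 3D point (X, Y, Z) in the camera frame to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (fx * x / z + cx, fy * y / z + cy)


def reprojection_error(point, observed_pixel, fx, fy, cx, cy):
    """Euclidean pixel distance between the projection and the observation."""
    u, v = project_pinhole(point, fx, fy, cx, cy)
    return math.hypot(u - observed_pixel[0], v - observed_pixel[1])


# Hypothetical intrinsics: focal lengths and principal point in pixels.
intrinsics = dict(fx=100.0, fy=100.0, cx=50.0, cy=60.0)
pixel = project_pinhole((1.0, 2.0, 4.0), **intrinsics)
error = reprojection_error((1.0, 2.0, 4.0), (75.5, 110.0), **intrinsics)
```

A calibration is "sub-pixel" in this sense when errors like `error` stay below 1.0 across the evaluated points.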
Full Text Available
Residual Policy Learning for Shared Autonomy

Schaff, Charles; Walter, Matthew R. (July 2020, Proceedings of Robotics: Science and Systems (RSS))

Shared autonomy provides an effective framework for human-robot collaboration that takes advantage of the complementary strengths of humans and robots to achieve common goals. Many existing approaches to shared autonomy make restrictive assumptions that the goal space, environment dynamics, or human policy are known a priori, or are limited to discrete action spaces, preventing those methods from scaling to complicated real-world environments. We propose a model-free, residual policy learning algorithm for shared autonomy that alleviates the need for these assumptions. Our agents are trained to minimally adjust the human’s actions such that a set of goal-agnostic constraints is satisfied. We test our method in two continuous control environments: Lunar Lander, a 2D flight control domain, and a 6-DOF quadrotor reaching task. In experiments with human and surrogate pilots, our method significantly improves task performance without any knowledge of the human’s goal beyond the constraints. These results highlight the ability of model-free deep reinforcement learning to realize assistive agents suited to continuous control settings with little knowledge of user intent.
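The residual structure described above (executed action = human action + a small correction) can be sketched in a one-dimensional toy setting. In the paper the correction is learned with model-free reinforcement learning; here a hand-written rule stands in for that learned policy, and the constraint (keep the next position inside a bound), the function names, and the `limit` parameter are all assumptions made for illustration.

```python
def residual(human_action, state, limit=1.0):
    """Hypothetical stand-in for a learned residual policy: the smallest
    correction that keeps state + action inside [-limit, limit]."""
    next_state = state + human_action
    clipped = max(-limit, min(limit, next_state))
    return clipped - next_state


def shared_action(human_action, state, limit=1.0):
    """Action actually executed: the human's action plus the residual."""
    return human_action + residual(human_action, state, limit)


safe = shared_action(0.2, state=0.5)       # constraint already satisfied
corrected = shared_action(0.8, state=0.5)  # would overshoot the bound
```

When the human's action already satisfies the constraint the residual is zero, so the human keeps full control; the correction only appears when a constraint would be violated, and it needs no knowledge of the human's goal.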
Full Text Available
Multigrid Neural Memory

Huynh, Tri; Maire, Michael; Walter, Matthew R. (July 2020, Proceedings of the International Conference on Machine Learning (ICML))

We introduce a novel approach to endowing neural networks with emergent, long-term, large-scale memory. Distinct from strategies that connect neural networks to external memory banks via intricately crafted controllers and hand-designed attentional mechanisms, our memory is internal, distributed, co-located alongside computation, and implicitly addressed, while being drastically simpler than prior efforts. Architecting networks with multigrid structure and connectivity, while distributing memory cells alongside computation throughout this topology, we observe the emergence of coherent memory subsystems. Our hierarchical spatial organization, parameterized convolutionally, permits efficient instantiation of large-capacity memories, while multigrid topology provides short internal routing pathways, allowing convolutional networks to efficiently approximate the behavior of fully connected networks. Such networks have an implicit capacity for internal attention; augmented with memory, they learn to read and write specific memory locations in a dynamic data-dependent manner. We demonstrate these capabilities on exploration and mapping tasks, where our network is able to self-organize and retain long-term memory for trajectories of thousands of time steps. On tasks decoupled from any notion of spatial geometry (sorting, associative recall, and question answering), our design functions as a truly generic memory and yields excellent results.
Full Text Available
Language-guided Semantic Mapping and Mobile Manipulation in Partially Observable Environments

Patki, Siddharth; Fahnestock, Ethan; Howard, Thomas M.; Walter, Matthew R. (October 2020, Proceedings of Machine Learning Research)

Recent advances in data-driven models for grounded language understanding have enabled robots to interpret increasingly complex instructions. Two fundamental limitations of these methods are that most require a full model of the environment to be known a priori, and they attempt to reason over a world representation that is flat and unnecessarily detailed, which limits scalability. Recent semantic mapping methods address partial observability by exploiting language as a sensor to infer a distribution over topological, metric and semantic properties of the environment. However, maintaining a distribution over highly detailed maps that can support grounding of diverse instructions is computationally expensive and hinders real-time human-robot collaboration. We propose a novel framework that learns to adapt perception according to the task in order to maintain compact distributions over semantic maps. Experiments with a mobile manipulator demonstrate more efficient instruction following in a priori unknown environments.
Full Text Available
Benchmarking Structured Policies and Policy Optimization for Real-World Dexterous Object Manipulation

https://doi.org/10.1109/LRA.2021.3129139

Funk, Niklas; Schaff, Charles; Madan, Rishabh; Yoneda, Takuma; De Jesus, Julen Urain; Watson, Joe; Gordon, Ethan K.; Widmaier, Felix; Bauer, Stefan; Srinivasa, Siddhartha S.; et al (January 2022, IEEE Robotics and Automation Letters)

Full Text Available
Maximum Expected Hitting Cost of a Markov Decision Process and Informativeness of Rewards

Dai, Falcon Z.; Walter, Matthew R. (December 2019, Advances in neural information processing systems)

We propose a new complexity measure for Markov decision processes (MDPs), the maximum expected hitting cost (MEHC). This measure tightens the closely related notion of diameter by accounting for the reward structure. We show that this parameter replaces diameter in the upper bound on the optimal value span of an extended MDP, thus refining the associated upper bounds on the regret of several UCRL2-like algorithms. Furthermore, we show that potential-based reward shaping can induce equivalent reward functions with varying informativeness, as measured by MEHC. We further establish that shaping can reduce or increase MEHC by at most a factor of two in a large class of MDPs with finite MEHC and unsaturated optimal average rewards.
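Potential-based reward shaping adds a difference of potentials to each reward. The sketch below uses the standard form r'(s, s') = r(s, s') + gamma * phi(s') - phi(s) with gamma = 1, which is assumed here to suit the undiscounted setting the abstract implies; the three-state cycle, the potential values, and the function name `shape` are made up for illustration. It shows that shaping changes individual rewards while the total collected around a cycle is unchanged, because the potential terms telescope.

```python
def shape(reward, potential, s, s_next, gamma=1.0):
    """Potential-based shaping: r + gamma * phi(s_next) - phi(s)."""
    return reward + gamma * potential[s_next] - potential[s]


# Deterministic 3-state cycle given as (state, next_state, reward).
cycle = [(0, 1, 0.0), (1, 2, 0.0), (2, 0, 1.0)]
potential = {0: 0.0, 1: 0.25, 2: 0.5}  # hypothetical potential function

original = [r for _, _, r in cycle]
shaped = [shape(r, potential, s, s_next) for s, s_next, r in cycle]

print(shaped)                      # [0.25, 0.25, 0.5]
print(sum(original), sum(shaped))  # 1.0 1.0
```

The shaped rewards are spread more evenly along the cycle than the original sparse reward, which is the kind of difference in informativeness between equivalent reward functions that the abstract attributes to shaping.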
Full Text Available
Language-guided Semantic Mapping and Mobile Manipulation in Partially Observable Environments

Patki, Siddharth; Fahnestock, Ethan; Howard, Thomas M.; Walter, Matthew R. (November 2019, Conference on Robot Learning (CoRL))

Full Text Available
Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning

https://doi.org/10.1109/ICRA.2019.8793537

Schaff, Charles; Yunis, David; Chakrabarti, Ayan; Walter, Matthew R. (May 2019, Proceedings of the International Conference on Robotics and Automation (ICRA))

The physical design of a robot and the policy that controls its motion are inherently coupled, and should be determined according to the task and environment. In an increasing number of applications, data-driven and learning-based approaches, such as deep reinforcement learning, have proven effective at designing control policies. For most tasks, the only way to evaluate a physical design with respect to such control policies is empirical, i.e., by picking a design and training a control policy for it. Since training these policies is time-consuming, it is computationally infeasible to train separate policies for all possible designs as a means to identify the best one. In this work, we address this limitation by introducing a method that jointly optimizes over the physical design and control network. Our approach maintains a distribution over designs and uses reinforcement learning to optimize a control policy to maximize expected reward over the design distribution. We give the controller access to design parameters to allow it to tailor its policy to each design in the distribution. Throughout training, we shift the distribution towards higher-performing designs, eventually converging to a design and control policy that are jointly optimal. We evaluate our approach in the context of legged locomotion, and demonstrate that it discovers novel designs and walking gaits, outperforming baselines across different settings.
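The step of shifting a design distribution towards higher-performing designs can be sketched with a scalar design parameter. This example covers only the design side: the controller's return is replaced by a hypothetical function `episode_return`, and the update is a simple elite-averaging (cross-entropy-style) rule swapped in for illustration, not the update used in the paper.

```python
import random


def episode_return(design):
    """Hypothetical stand-in for the return a trained controller achieves
    on `design`; it peaks at design = 2.0."""
    return -(design - 2.0) ** 2


def shift_design_distribution(mean=0.0, std=0.5, iterations=30,
                              samples=50, elites=10, seed=0):
    """Repeatedly sample designs from a Gaussian and move its mean to the
    average of the best-performing samples."""
    rng = random.Random(seed)
    for _ in range(iterations):
        designs = [rng.gauss(mean, std) for _ in range(samples)]
        designs.sort(key=episode_return, reverse=True)  # best designs first
        mean = sum(designs[:elites]) / elites
    return mean


final_mean = shift_design_distribution()
```

In the full method, evaluating a sampled design would mean rolling out the shared, design-conditioned control policy rather than calling a closed-form function, and the policy would be trained alongside the distribution update.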
Full Text Available
Integrated Benchmarking and Design for Reproducible and Accessible Evaluation of Robotic Agents

https://doi.org/10.1109/IROS45743.2020.9341677

Tani, Jacopo; Daniele, Andrea F.; Bernasconi, Gianmarco; Camus, Amaury; Petrov, Aleksandar; Courchesne, Anthony; Mehta, Bhairav; Suri, Rohit; Zaluska, Tomasz; Walter, Matthew R.; et al (October 2020, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS))

Full Text Available
