Title: The blindfolded traveler’s problem: A search framework for motion planning with contact estimates

We address the problem of robot motion planning under uncertainty where the only observations are through contact with the environment. Such problems are typically solved by planning optimistically, assuming unknown space is free, moving along the planned path, and re-planning if the robot collides. However, this approach can be very inefficient, leading to many unnecessary collisions and unproductive motion. We propose a new formulation, the Blindfolded Traveler’s Problem (BTP), for planning on a graph containing edges of unknown validity, where true validity is observed only through attempted traversal by the robot. The solution to a BTP is a policy indicating the next edge to attempt given previous observations and an initial belief. We prove that the BTP is NP-complete and show that exact modeling of the belief is intractable; we therefore present several approximation-based policies and beliefs. For the policy, we propose graph search with edge weights augmented by the probability of collision. For the belief representation, we propose a weighted Mixture of Experts of Collision Hypothesis Sets and a Manifold Particle Filter. Empirical evaluation in simulation and on a real robot arm shows that our proposed approach vastly outperforms several baselines, as well as a previous approach that does not employ the BTP framework.
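The policy component lends itself to a short sketch: a best-first graph search in which each edge's weight is its nominal cost plus a penalty scaled by the believed probability that traversing the edge causes a collision. This is an illustrative reading of the abstract, not the authors' implementation; `collision_prob`, `alpha`, and the graph encoding are all assumed.

```python
import heapq

def augmented_shortest_path(graph, start, goal, collision_prob, alpha=10.0):
    """Best-first search where each edge cost is augmented by an
    estimated collision probability.

    graph: dict node -> list of (neighbor, edge_cost)
    collision_prob: dict (node, neighbor) -> believed P(edge is blocked)
    alpha: length-vs-risk trade-off (illustrative, not from the paper)
    """
    frontier = [(0.0, start, [start])]
    best = {start: 0.0}
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path  # the robot attempts the first edge of this path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, edge_cost in graph[node]:
            p = collision_prob.get((node, nbr), 0.0)
            new_cost = cost + edge_cost + alpha * p
            if new_cost < best.get(nbr, float("inf")):
                best[nbr] = new_cost
                heapq.heappush(frontier, (new_cost, nbr, path + [nbr]))
    return None
```

In the BTP setting this search would be re-run after every attempted traversal, with `collision_prob` recomputed from the updated belief (e.g., the Collision Hypothesis Sets mixture or the Manifold Particle Filter).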

 
NSF-PAR ID: 10413948
Publisher / Repository: SAGE Publications
Journal Name: The International Journal of Robotics Research
Volume: 42
Issue: 4-5
ISSN: 0278-3649
Page Range / eLocation ID: p. 289-309
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Performing robust goal-directed manipulation tasks remains a crucial challenge for autonomous robots. Ideally, shared autonomous control of manipulators would allow human users to specify their intent as a goal state and have the robot reason over the actions and motions required to achieve it. However, realizing this goal remains elusive because of the difficulty of perceiving the robot’s environment. We describe the problem of axiomatic scene estimation (AxScEs) for robot manipulation in cluttered scenes: estimating a tree-structured scene graph that describes the configuration of objects observed from robot sensing. We propose generative approaches to inferring the robot’s environment as a scene graph: an axiomatic particle filter and an axiomatic scene estimator based on a Markov chain Monte Carlo sampler. The results of AxScEs estimation are axioms amenable to goal-directed manipulation through symbolic inference for task planning and collision-free motion planning and execution. We demonstrate these results on goal-directed manipulation of multi-object scenes by a PR2 robot.
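A generic predict-weight-resample step conveys the particle-filter flavor of this scene estimation; the sketch below is illustrative only, with `observe_likelihood` and `perturb` as assumed interfaces rather than the paper's AxScEs implementation.

```python
import random

def particle_filter_step(particles, observe_likelihood, perturb):
    """One generic particle-filter update over scene-graph hypotheses.

    particles: list of scene hypotheses (any cloneable state)
    observe_likelihood: fn(scene) -> P(observation | scene) from sensing
    perturb: fn(scene) -> slightly modified scene (proposal/diffusion)
    """
    # Diffuse hypotheses, then weight each by how well it explains sensing.
    moved = [perturb(s) for s in particles]
    weights = [observe_likelihood(s) for s in moved]
    total = sum(weights)
    if total == 0.0:
        return moved  # no hypothesis explains the data; keep the diffused set
    # Importance resampling: hypotheses that explain the data survive.
    return random.choices(moved, weights=weights, k=len(particles))
```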
  2. The TEB hierarchical planner for real-time navigation through unknown environments is highly effective at balancing collision avoidance with goal-directed motion. Developed over several years and publications, it implements a multi-trajectory, optimization-based synthesis method for identifying topologically distinct candidate trajectories through navigable space. Unfortunately, the underlying factor-graph approach to the optimization problem induces a mismatch between grid-based representations and the optimization graph, which leads to several timing and optimization inefficiencies. This paper explores the impact of using egocentric, perception-space representations for the local planning map. Doing so alleviates many of the identified issues with TEB and leads to a new method called egoTEB. Timing experiments and Monte Carlo evaluations in benchmark worlds quantify the benefits of egoTEB for navigation through uncertain environments.
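A minimal sketch of what a perception-space local map can look like: robot-frame obstacle points collapsed into an egocentric range-bearing scan. egoTEB's actual representation differs; everything named here is illustrative.

```python
import math

def to_egocentric_scan(points, n_bins=180, max_range=5.0):
    """Collapse robot-frame obstacle points (x, y) into a 1-D egocentric
    scan: the range to the nearest obstacle in each bearing bin."""
    scan = [max_range] * n_bins
    for x, y in points:
        r = math.hypot(x, y)
        if r >= max_range:
            continue  # beyond the local planning horizon
        # Map bearing in [-pi, pi] to a bin index.
        b = int((math.atan2(y, x) + math.pi) / (2 * math.pi) * n_bins) % n_bins
        scan[b] = min(scan[b], r)
    return scan
```

A representation like this stays aligned with the sensor data, which is the kind of grid-vs-optimization mismatch avoidance the abstract attributes to perception-space planning.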
  3. Contemporary approaches to perception, planning, estimation, and control have allowed robots to operate robustly as our remote surrogates in uncertain, unstructured environments. This progress now creates an opportunity for robots to operate not only in isolation, but also with and alongside humans in our complex environments. Realizing this opportunity requires an efficient and flexible medium through which humans can communicate with collaborative robots. Natural language provides one such medium, and through significant progress in statistical methods for natural-language understanding, robots are now able to interpret a diverse array of free-form navigation, manipulation, and mobile-manipulation commands. However, most contemporary approaches require a detailed, prior spatial-semantic map of the robot’s environment that models the space of possible referents of an utterance. Consequently, these methods fail when robots are deployed in new, previously unknown, or partially observed environments, particularly when mental models of the environment differ between the human operator and the robot. This paper provides a comprehensive description of a novel learning framework that allows field and service robots to interpret and correctly execute natural-language instructions in a priori unknown, unstructured environments. Integral to our approach is its use of language as a “sensor”—inferring spatial, topological, and semantic information implicit in natural-language utterances and then exploiting this information to learn a distribution over a latent environment model. We incorporate this distribution into a probabilistic language-grounding model and infer a distribution over a symbolic representation of the robot’s action space, consistent with the utterance. We use imitation learning to identify a belief-space policy that reasons over the environment and behavior distributions. We evaluate our framework through a variety of navigation and mobile-manipulation experiments involving an unmanned ground vehicle, a robotic wheelchair, and a mobile manipulator, demonstrating that the algorithm can follow natural-language instructions without prior knowledge of the environment.
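At its simplest, the "language as a sensor" idea is a Bayesian reweighting of candidate environment models by how well each explains the utterance. The sketch below shows only that minimal update; the paper's grounding model is far richer, and both arguments here are assumed interfaces.

```python
def update_world_belief(prior, utterance, likelihood):
    """Reweight a distribution over candidate environment models,
    treating an utterance as an observation.

    prior: dict world_model -> P(world_model)
    likelihood: fn(utterance, world_model) -> P(utterance | world_model)
    """
    posterior = {w: p * likelihood(utterance, w) for w, p in prior.items()}
    z = sum(posterior.values())
    if z == 0.0:
        return dict(prior)  # utterance uninformative under every model
    return {w: p / z for w, p in posterior.items()}
```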
  4. We present a closed-loop multi-arm motion planner that is scalable and flexible with team size. Traditional multi-arm robotic systems have relied on centralized motion planners whose run times often scale exponentially with team size and which therefore fall back on open-loop control, failing to handle dynamic environments. In this paper, we tackle this problem with multi-agent reinforcement learning, where a shared policy network is trained to control each individual robot arm to reach its target end-effector pose, given observations of its workspace state and target end-effector pose. The policy is trained using Soft Actor-Critic with expert demonstrations from a sampling-based motion planning algorithm (i.e., BiRRT). By leveraging classical planning algorithms, we improve the learning efficiency of the reinforcement learning algorithm while retaining the fast inference time of neural networks. The resulting policy scales sub-linearly and can be deployed on multi-arm systems with variable team sizes. Thanks to the closed-loop and decentralized formulation, our approach generalizes to multi-arm systems of 5-10 arms and to dynamic moving targets (>90% success rate for a 10-arm system), despite being trained only on planning tasks with 1-4 arms and static targets.
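The decentralized formulation can be pictured in a few lines: every arm queries the same trained policy with only its own observation, so team size never appears in the controller. The `env` interface below is hypothetical and stands in for the simulator or hardware loop; `policy` stands in for the trained SAC actor.

```python
def run_team(policy, env, n_steps=200):
    """Closed-loop rollout for a variable-size multi-arm team sharing
    one policy (sketch; the environment interface is assumed)."""
    obs = env.reset()  # one observation per arm, any team size
    for _ in range(n_steps):
        # Decentralized: each arm queries the same network independently.
        actions = [policy(o) for o in obs]
        obs, done = env.step(actions)
        if done:  # e.g., all end-effectors reached their target poses
            break
```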
  5. We study the following problem, which to our knowledge has been addressed only partially in the literature and not in full generality. An agent observes two players play a zero-sum game that is known to the players but not to the agent. The agent observes the actions and state transitions of their game play, but not rewards. The players may play either optimally (according to some Nash equilibrium) or according to any other solution concept, such as a quantal response equilibrium. Following these observations, the agent must recommend a policy for one player, say Player 1. The goal is to recommend a policy that is minimally exploitable under the true, but unknown, game. We take a Bayesian approach. We establish a likelihood function based on observations and the specified solution concept. We then propose an approach based on Markov chain Monte Carlo (MCMC), which allows us to approximately sample games from the agent’s posterior belief distribution. Once we have a batch of independent samples from the posterior, we use linear programming and backward induction to compute a policy for Player 1 that minimizes the sum of exploitabilities over these games. This approximates the policy that minimizes the expected exploitability under the full distribution. Our approach is also capable of handling counterfactuals, where known modifications are applied to the unknown game. We show that our Bayesian MCMC-based technique outperforms two other techniques—one based on the equilibrium policy of the maximum-probability game and the other based on imitation of observed behavior—on all the tested stochastic game environments.
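A minimal sketch of the posterior-sampling step, assuming a symmetric proposal and a log-likelihood that already encodes the chosen solution concept (both assumed interfaces, not the paper's code):

```python
import math
import random

def mh_sample_games(log_likelihood, propose, init, n_samples=100, burn=100):
    """Metropolis-Hastings over unknown game parameters: accept proposals
    in proportion to how well they explain the observed game play."""
    game, ll = init, log_likelihood(init)
    samples = []
    for t in range(burn + n_samples):
        cand = propose(game)  # symmetric proposal assumed
        cand_ll = log_likelihood(cand)
        if cand_ll >= ll or random.random() < math.exp(cand_ll - ll):
            game, ll = cand, cand_ll
        if t >= burn:
            samples.append(game)
    return samples
```

A policy for Player 1 would then be computed against this batch, e.g., via the linear-programming and backward-induction step the abstract describes, minimizing the summed exploitability over the sampled games.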