

Title: Opportunistic Multi-Robot Environmental Sampling via Decentralized Markov Decision Processes
We study the problem of information sampling with a group of mobile robots in an unknown environment. Each robot is assigned a unique region of the environment for the sampling task. The objective of the robots is to visit a subset of locations such that the collected information is maximized and, consequently, the underlying information model matches reality as closely as possible. The robots have limited communication ranges and can therefore communicate only when near one another. The robots operate in a stochastic environment, and their control uncertainty is handled using factored Decentralized Markov Decision Processes (Dec-MDPs). When two or more robots communicate, they share their past noisy observations and use a Gaussian mixture model to update their local information models, which in turn helps them obtain a better Dec-MDP policy. Simulation results show that our proposed strategy predicts an information model closer to the ground truth than competing algorithms do, and that it reduces overall uncertainty more than comparable algorithms.
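As a hedged illustration of the observation-sharing step just described, the sketch below pools two robots' noisy samples on contact and refits a Gaussian mixture model as each robot's local information model. The `Robot` class, its fields, and `exchange_and_update` are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch (not the paper's code): robots that meet pool their
# noisy observations and refit a GMM-based local information model.
import numpy as np
from sklearn.mixture import GaussianMixture

class Robot:
    def __init__(self, region_id, n_components=3):
        self.region_id = region_id
        self.observations = []      # (x, y, noisy_value) samples
        self.model = None           # local information model (a fitted GMM)
        self.n_components = n_components

    def observe(self, x, y, value):
        self.observations.append((x, y, value))

    def refit(self):
        X = np.asarray(self.observations)
        # never request more mixture components than we have samples
        k = min(self.n_components, len(X))
        self.model = GaussianMixture(n_components=k).fit(X)

def exchange_and_update(a, b):
    """On contact, share past noisy observations and refit both GMMs."""
    shared = a.observations + b.observations
    a.observations, b.observations = list(shared), list(shared)
    a.refit()
    b.refit()
```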
Award ID(s): 1849291
NSF-PAR ID: 10333064
Journal Name: International Symposium on Distributed Autonomous Robotic Systems
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. We examine a novel setting in which two parties have partial knowledge of the elements that make up a Markov Decision Process (MDP) and must cooperate to compute and execute an optimal policy for the problem constructed from those elements. This situation arises when one party wants to give a robot some task but does not wish to divulge those details to a second party, while the second party possesses sensitive data about the robot's dynamics (information needed for planning). Both parties want the robot to perform the task successfully, but neither is willing to disclose any more information than is absolutely necessary. We utilize techniques from secure multi-party computation, combining primitives and algorithms to construct protocols that can compute an optimal policy while ensuring that the policy remains opaque by being split across both parties. To execute a split policy, we also give a protocol that enables the robot to determine what actions to trigger, while the second party guards against attempts to probe for information inconsistent with the policy's prescribed execution. To improve scalability, we find that basis functions and constraint sampling methods are useful in forming effective approximate MDPs. We report simulation results examining performance and precision, and assess the scaling properties of our Python implementation. We also describe a hardware proof-of-feasibility implementation using inexpensive physical robots, which, being a small-scale instance, can be solved directly.
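A minimal sketch of the split-policy idea, assuming plain additive secret sharing over a public prime: each policy entry is divided into two shares so that neither party's table alone reveals anything, and the robot reconstructs only the action for its current state. The paper's actual protocols combine richer secure multi-party computation primitives; `share`, `split_policy`, and `execute_step` are toy stand-ins.

```python
# Toy illustration of a policy kept opaque by being split across two parties.
import secrets

P = 2**61 - 1  # a public prime modulus

def share(action_index):
    """Split one policy entry into two additive shares mod P."""
    r = secrets.randbelow(P)
    return r, (action_index - r) % P

def split_policy(policy):
    """policy: dict state -> action index. Returns one share table per party."""
    table_a, table_b = {}, {}
    for s, a in policy.items():
        table_a[s], table_b[s] = share(a)
    return table_a, table_b

def execute_step(state, table_a, table_b):
    """The robot combines both parties' shares to recover only the action
    for its current state; neither table alone reveals the policy."""
    return (table_a[state] + table_b[state]) % P

# Example: a 3-state policy split across two parties.
pi = {0: 1, 1: 0, 2: 2}
ta, tb = split_policy(pi)
assert all(execute_step(s, ta, tb) == a for s, a in pi.items())
```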
  2. We consider a planning problem for a robot operating in an information-degraded environment. Our contribution to the state of the art is addressing this problem when robots have limited sensing capabilities and thus acquire information only in certain locations. We therefore need a method that balances between driving the robot to the goal and toward regions where it can gain information (or reduce uncertainty). We present a novel sampling-based planner (Particle Filter based Affine Quadratic Tree, PF-AQT) that explores the environment and plans to reach a goal with minimal uncertainty. We then use the output trajectory from PF-AQT to initialize an optimization-based planner that finds a locally optimal trajectory minimizing control effort and uncertainty. In doing so, we reap the exploration benefits of sampling-based methods and the exploitation benefits of optimization-based methods for dealing with uncertainty and the robot's limited sensing capabilities. We demonstrate our results using two dynamical systems: a double-integrator model and a non-holonomic car-like robot.
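To make the limited-sensing premise concrete, here is a minimal particle filter for a 1-D double integrator that receives position measurements only inside a fixed interval. The dynamics, noise levels, and region bounds are assumptions for illustration; this is not PF-AQT itself.

```python
# Illustrative only: particle filter where measurements are available
# solely inside a sensing region, for a 1-D double integrator.
import numpy as np

rng = np.random.default_rng(0)
DT, N = 0.1, 500
particles = rng.normal([0.0, 0.0], [1.0, 0.5], size=(N, 2))  # columns: [pos, vel]
weights = np.full(N, 1.0 / N)

def predict(u):
    """Propagate double-integrator dynamics with process noise."""
    particles[:, 0] += particles[:, 1] * DT
    particles[:, 1] += u * DT + rng.normal(0.0, 0.05, N)

def in_sensing_region(pos):
    return 2.0 <= pos <= 3.0    # information is only acquired here

def update(z, sigma=0.1):
    """Reweight by measurement likelihood, then resample."""
    global particles, weights
    w = np.exp(-0.5 * ((z - particles[:, 0]) / sigma) ** 2) + 1e-300
    idx = rng.choice(N, size=N, p=w / w.sum())
    particles = particles[idx]
    weights = np.full(N, 1.0 / N)

predict(u=0.5)
if in_sensing_region(particles[:, 0].mean()):
    update(z=2.5)
```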
  3. Accurately tracking dynamic targets relies on robots accounting for uncertainties in their own states in order to share information and maintain safety. The problem becomes even more challenging when there is an unknown and time-varying number of targets in the environment. In this paper, we address this problem by introducing four new distributed algorithms that allow large teams of robots to: i) run the prediction and ii) update steps of a distributed recursive Bayesian multi-target tracker, iii) determine the set of local neighbors that must exchange data, and iv) exchange data in a consistent manner. All of these algorithms account for a bounded level of localization uncertainty in the robots by leveraging our recent introduction of the convex uncertainty Voronoi (CUV) diagram, which extends the traditional Voronoi diagram to account for localization uncertainty. The CUV diagram introduces a tessellation over the environment, which we use in this work both to distribute the multi-target tracker and to make control decisions about where to search next. We examine the efficacy of our method via a series of simulations and compare it to our previous work, which assumed perfect localization.
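The following is a deliberately crude stand-in for the CUV neighbor test: a pair of robots is flagged for data exchange when their sensing disks, inflated by each robot's worst-case localization error, can overlap. The real CUV diagram is a tessellation and is more precise; `neighbors` and its parameters are illustrative assumptions.

```python
# Crude illustrative proxy for the CUV-based neighbor determination.
import numpy as np

def neighbors(positions, loc_err, sense_r):
    """positions: (n, 2) position estimates; loc_err: (n,) localization
    error bounds; sense_r: sensing radius. Returns a set of index pairs
    that must exchange tracking data."""
    n = len(positions)
    pairs = set()
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(positions[i] - positions[j])
            # inflate each sensing disk by that robot's worst-case error
            if d <= 2 * sense_r + loc_err[i] + loc_err[j]:
                pairs.add((i, j))
    return pairs

pos = np.array([[0.0, 0.0], [3.0, 0.0], [10.0, 0.0]])
print(neighbors(pos, loc_err=np.array([0.2, 0.2, 0.2]), sense_r=1.5))  # {(0, 1)}
```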
  4. Contemporary approaches to perception, planning, estimation, and control have allowed robots to operate robustly as our remote surrogates in uncertain, unstructured environments. This progress now creates an opportunity for robots to operate not only in isolation, but also with and alongside humans in our complex environments. Realizing this opportunity requires an efficient and flexible medium through which humans can communicate with collaborative robots. Natural language provides one such medium, and through significant progress in statistical methods for natural-language understanding, robots are now able to interpret a diverse array of free-form navigation, manipulation, and mobile-manipulation commands. However, most contemporary approaches require a detailed, prior spatial-semantic map of the robot’s environment that models the space of possible referents of an utterance. Consequently, these methods fail when robots are deployed in new, previously unknown, or partially observed environments, particularly when mental models of the environment differ between the human operator and the robot. This paper provides a comprehensive description of a novel learning framework that allows field and service robots to interpret and correctly execute natural-language instructions in a priori unknown, unstructured environments. Integral to our approach is its use of language as a “sensor”: inferring spatial, topological, and semantic information implicit in natural-language utterances and then exploiting this information to learn a distribution over a latent environment model. We incorporate this distribution in a probabilistic language grounding model and infer a distribution over a symbolic representation of the robot’s action space, consistent with the utterance. We use imitation learning to identify a belief-space policy that reasons over the environment and behavior distributions. We evaluate our framework through a variety of navigation and mobile-manipulation experiments involving an unmanned ground vehicle, a robotic wheelchair, and a mobile manipulator, demonstrating that the algorithm can follow natural-language instructions without prior knowledge of the environment.
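A toy sketch of the "language as a sensor" idea, under stated assumptions: sampled environment hypotheses are reweighted by a placeholder utterance likelihood, yielding a distribution over latent maps against which a command could then be grounded. Every function here (`language_likelihood`, `infer_world_distribution`) is a hypothetical stand-in, not the paper's learned models.

```python
# Hypothetical illustration: weight sampled world hypotheses by how well
# they explain the utterance, giving a posterior over latent maps.
def language_likelihood(utterance, world):
    """Placeholder for P(utterance | world): reward worlds that contain
    an object the utterance refers to."""
    return 1.0 if any(tok in world["objects"] for tok in utterance.split()) else 0.1

def infer_world_distribution(utterance, prior_samples):
    w = [language_likelihood(utterance, s) for s in prior_samples]
    z = sum(w)
    return [(s, wi / z) for s, wi in zip(prior_samples, w)]

# Two hypothesized partial maps; "box" in the command favors the first.
samples = [{"objects": {"box", "door"}}, {"objects": {"chair"}}]
print(infer_world_distribution("go to the box", samples))
```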
  5. Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems. However, POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid, which is often the case for physical systems. While recent online sampling-based POMDP algorithms that plan with observation likelihood weighting have shown practical effectiveness, a general theory characterizing the approximation error of the particle filtering techniques that these algorithms use has not previously been proposed. Our main contribution is bounding the error between any POMDP and its corresponding finite sample particle belief MDP (PB-MDP) approximation. This fundamental bridge between PB-MDPs and POMDPs allows us to adapt any sampling-based MDP algorithm to a POMDP by solving the corresponding particle belief MDP, thereby extending the convergence guarantees of the MDP algorithm to the POMDP. Practically, this is implemented by using the particle filter belief transition model as the generative model for the MDP solver. While this requires access to the observation density model from the POMDP, it only increases the transition sampling complexity of the MDP solver by a factor of O(C), where C is the number of particles. Thus, when combined with sparse sampling MDP algorithms, this approach can yield algorithms for POMDPs that have no direct theoretical dependence on the size of the state and observation spaces. In addition to our theoretical contribution, we perform five numerical experiments on benchmark POMDPs to demonstrate that a simple MDP algorithm adapted using PB-MDP approximation, Sparse-PFT, achieves performance competitive with other leading continuous observation POMDP solvers.

     
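As a hedged sketch of the PB-MDP bridge, the function below performs one particle-belief transition with observation likelihood weighting, usable as the generative model for a sampling-based MDP solver. The `ToyPOMDP` and its `step`, `sample_obs`, and `obs_density` interface are assumptions for illustration; note the per-transition cost grows linearly in the particle count C, matching the O(C) factor cited in the abstract.

```python
# Illustrative particle belief MDP (PB-MDP) transition, not the paper's code.
import numpy as np

rng = np.random.default_rng(1)

class ToyPOMDP:
    """Assumed 1-D random-walk POMDP with Gaussian observations."""
    def step(self, s, a, rng):
        s2 = s + a + rng.normal(0.0, 0.1)
        return s2, -abs(s2)                       # reward: stay near origin
    def sample_obs(self, s, rng):
        return s + rng.normal(0.0, 0.2)
    def obs_density(self, o, s):
        return np.exp(-0.5 * ((o - s) / 0.2) ** 2)

def pbmdp_step(belief, action, pomdp, rng):
    """belief: (states, weights) particle set. Returns (belief', reward)."""
    states, weights = belief
    # 1) propagate every particle through the POMDP's generative model
    next_states, rewards = zip(*(pomdp.step(s, action, rng) for s in states))
    next_states = np.asarray(next_states)
    # 2) sample one observation from a randomly chosen propagated particle
    o = pomdp.sample_obs(next_states[rng.integers(len(next_states))], rng)
    # 3) reweight by observation likelihood (requires the density model)
    w = weights * np.array([pomdp.obs_density(o, s) for s in next_states])
    return (next_states, w / w.sum()), float(np.dot(weights, rewards))

C = 100  # particle count; transition cost scales as O(C)
belief = (rng.normal(0.0, 1.0, C), np.full(C, 1.0 / C))
belief, r = pbmdp_step(belief, action=0.1, pomdp=ToyPOMDP(), rng=rng)
```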