Title: Conditioning Style on Substance: Plans for Narrative Observation
We consider a robot tasked with observing its environment and later selectively summarizing what it saw as a vivid, structured narrative. The robot interacts with an uncertain environment, modelled as a stochastic process, and must decide what events to pay attention to (substance), and how to best make its recording (style) for later compilation of its summary. If carrying a video camera, for example, it must decide where to be, what to aim the camera at, and which stylistic selections, like the focus and level of zoom, are most suitable. This paper examines planning algorithms that help the robot predict events that (1) will likely occur; (2) would be useful in telling a tale; and (3) may be hewed to cohere stylistically. The third factor, a time-extended requirement, is entirely neglected in earlier, simpler work. With formulations based on underlying Markov Decision Processes, we compare two algorithms: a monolithic planner that jointly plans over events and style pairs and a decoupled approach that prescribes style conditioned on events. The decoupled approach is seen to be effective and much faster to compute, suggesting that computational expediency justifies the separation of substance from style. Finally, we also report on our hardware implementation.
Award ID(s):
1849303 1849291 1849249
NSF-PAR ID:
10318014
Journal Name:
2021 IEEE International Conference on Robotics and Automation (ICRA)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. One important class of applications entails a robot scrutinizing, monitoring, or recording the evolution of an uncertain time-extended process. This sort of situation leads to an interesting family of active perception problems that can be cast as planning problems in which the robot is limited in what it sees and must, thus, choose what to pay attention to. The distinguishing characteristic of this setting is that the robot has influence over what it captures via its sensors, but exercises no causal authority over the process evolving in the world. As such, the robot’s objective is to observe the underlying process and to produce a “chronicle” of occurrent events, subject to a goal specification of the sorts of event sequences that may be of interest. This paper examines variants of such problems in which the robot aims to collect sets of observations to meet a rich specification of their sequential structure. We study this class of problems by modeling a stochastic process via a variant of a hidden Markov model and specify the event sequences of interest as a regular language, developing a vocabulary of “mutators” that enable sophisticated requirements to be expressed. Under different suppositions on the information gleaned about the event model, we formulate and solve different planning problems. The core underlying idea is the construction of a product between the event model and a specification automaton. Using this product, we compute a policy that minimizes the expected number of steps to reach a goal state. We introduce a general algorithm for this problem as well as several more efficient algorithms for important special cases. The paper reports and compares performance metrics by drawing on some small case studies analyzed in depth via simulation. Specifically, we study the effect of the robot’s observation model on the average time required for the robot to record a desired story. 
We also compare our algorithm with a baseline greedy algorithm, showing that our algorithm outperforms the greedy algorithm in terms of the average time to record a desired story. In addition, experiments show that the algorithms tailored to specialized variants of the problem are rather more efficient than the general algorithm.

     
  2. Tan, Jie; Toussaint, Marc; Darvish, Kourosh (Eds.)
    Contacts play a critical role in most manipulation tasks. Robots today mainly use proximal touch/force sensors to sense contacts, but the information they provide must be calibrated and is inherently local, with practical applications relying either on extensive surface coverage or restrictive assumptions to resolve ambiguities. We propose a vision-based extrinsic contact localization task: with only a single RGB-D camera view of a robot workspace, identify when and where an object held by the robot contacts the rest of the environment. We show that careful task-attuned design is critical for a neural network trained in simulation to discover solutions that transfer well to a real robot. Our final approach im2contact demonstrates the promise of versatile general-purpose contact perception from vision alone, performing well for localizing various contact types (point, line, or planar; sticking, sliding, or rolling; single or multiple), and even under occlusions in its camera view. Video results can be found at: https://sites.google.com/view/im2contact/home 
  3. As demands on manufacturing rapidly evolve, flexible manufacturing is becoming more essential for acquiring the necessary productivity to remain competitive. An innovative approach to flexible manufacturing is the introduction of fenceless robotic manufacturing cells to acquire and leverage greater human-robot collaboration (HRC). This involves operations in which a human and a robot share a space, complete tasks together, and interact with each other. Such operations, however, pose serious safety concerns. Before HRC can become a viable possibility, robots must be capable of safely operating within and responding to events in dynamic environments. Furthermore, the robot must be able to do this quickly during online operation. This paper outlines an algorithm for predictive collision detection. This algorithm gives the robot the ability to look ahead along its own trajectory and the trajectories of other bodies in its environment, and to predict potential collisions. The algorithm approximates a continuous swept volume of any articulated body along its trajectory by taking only a few time-sequential samples of the predicted orientations of the body and creating surfaces that patch the orientations together with Coons patches. Run-time data collected on this algorithm suggest that the algorithm can accurately predict future collisions in under 30 ms.
  4. A robot can now grasp an object more effectively than ever before, but once it has the object what happens next? We show that a mild relaxation of the task and workspace constraints implicit in existing object grasping datasets can cause neural network based grasping algorithms to fail on even a simple block stacking task when executed under more realistic circumstances. To address this, we introduce the JHU CoSTAR Block Stacking Dataset (BSD), where a robot interacts with 5.1 cm colored blocks to complete an order-fulfillment style block stacking task. It contains dynamic scenes and real time-series data in a less constrained environment than comparable datasets. There are nearly 12,000 stacking attempts and over 2 million frames of real data. We discuss the ways in which this dataset provides a valuable resource for a broad range of other topics of investigation. We find that hand-designed neural networks that work on prior datasets do not generalize to this task. Thus, to establish a baseline for this dataset, we demonstrate an automated search of neural network based models using a novel multiple-input HyperTree MetaModel, and find a final model which makes reasonable 3D pose predictions for grasping and stacking on our dataset. The CoSTAR BSD, code, and instructions are available at sites.google.com/site/costardataset 
  5. We present a distributed Bayesian algorithm for robot swarms to classify a spatially distributed feature of an environment. This type of “go/no-go” decision appears in applications where a group of robots must collectively choose whether to take action, such as determining if a farm field should be treated for pests. Previous bio-inspired approaches to decentralized decision-making in robotics lack a statistical foundation, while decentralized Bayesian algorithms typically require a strongly connected network of robots. In contrast, our algorithm allows simple, sparsely distributed robots to quickly reach accurate decisions about a binary feature of their environment. We investigate the speed vs. accuracy tradeoff in decision-making by varying the algorithm’s parameters. We show that making fewer, less-correlated observations can improve decision-making accuracy, and that a well-chosen combination of prior and decision threshold allows for fast decisions with a small accuracy cost. Both speed and accuracy also improved with the addition of bio-inspired positive feedback. This algorithm is also adaptable to the difficulty of the environment. Compared to a fixed-time benchmark algorithm with accuracy guarantees, our Bayesian approach resulted in equally accurate decisions, while adapting its decision time to the difficulty of the environment.