Title: Conditioning Style on Substance: Plans for Narrative Observation
We consider a robot tasked with observing its environment and later selectively summarizing what it saw as a vivid, structured narrative. The robot interacts with an uncertain environment, modelled as a stochastic process, and must decide which events to pay attention to (substance) and how best to make its recording (style) for later compilation of its summary. If carrying a video camera, for example, it must decide where to be, what to aim the camera at, and which stylistic selections, like the focus and level of zoom, are most suitable. This paper examines planning algorithms that help the robot predict events that (1) will likely occur; (2) would be useful in telling a tale; and (3) may be hewed to cohere stylistically. The third factor, a time-extended requirement, is entirely neglected in earlier, simpler work. With formulations based on underlying Markov Decision Processes, we compare two algorithms: a monolithic planner that plans jointly over event-style pairs, and a decoupled approach that prescribes style conditioned on events. The decoupled approach proves effective and much faster to compute, suggesting that computational expediency justifies the separation of substance from style. Finally, we also report on our hardware implementation.
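To make the comparison concrete, the sketch below sets up a toy instance in Python with hypothetical events, styles, per-event utilities, and a small style-coherence bonus (all invented for illustration, not taken from the paper). The monolithic planner searches the joint event-style space, which grows as (|E||S|)^T, while the decoupled planner commits to events first and then prescribes a style conditioned on each chosen event.

```python
import itertools

EVENTS = ["goal", "pass", "foul"]        # hypothetical events to attend to
STYLES = ["wide", "zoom", "tracking"]    # hypothetical stylistic choices

def event_value(e):
    # assumed per-event narrative utility
    return {"goal": 1.0, "pass": 0.4, "foul": 0.7}[e]

def style_bonus(e, s, prev_s):
    # assumed style fit plus a small time-extended coherence reward
    fit = {("goal", "zoom"): 0.3, ("pass", "wide"): 0.2,
           ("foul", "tracking"): 0.25}.get((e, s), 0.0)
    return fit + (0.1 if s == prev_s else 0.0)

def monolithic(horizon=3):
    # exhaustive search over joint (event, style) sequences: O((|E||S|)^T)
    best, best_plan = -1.0, None
    for plan in itertools.product(itertools.product(EVENTS, STYLES),
                                  repeat=horizon):
        total, prev_s = 0.0, None
        for e, s in plan:
            total += event_value(e) + style_bonus(e, s, prev_s)
            prev_s = s
        if total > best:
            best, best_plan = total, plan
    return best, best_plan

def decoupled(horizon=3):
    # stage 1: choose events alone; stage 2: style conditioned on each event
    events = [max(EVENTS, key=event_value)] * horizon
    total, prev_s, plan = 0.0, None, []
    for e in events:
        s = max(STYLES, key=lambda s: style_bonus(e, s, prev_s))
        total += event_value(e) + style_bonus(e, s, prev_s)
        plan.append((e, s))
        prev_s = s
    return total, tuple(plan)

print(monolithic())   # optimal, but exponential in the joint space
print(decoupled())    # same score here, at a fraction of the cost
```

On this toy instance the decoupled plan happens to match the joint optimum while exploring a far smaller space, which is the computational-expediency argument in miniature; nothing here should be read as the paper's actual formulation.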
Award ID(s):
1849303 1849291 1849249
PAR ID:
10318014
Journal Name:
2021 IEEE International Conference on Robotics and Automation (ICRA)
Sponsoring Org:
National Science Foundation
More Like this
  1. Tan, Jie; Toussaint, Marc; Darvish, Kourosh (Ed.)
    Contacts play a critical role in most manipulation tasks. Robots today mainly use proximal touch/force sensors to sense contacts, but the information they provide must be calibrated and is inherently local, with practical applications relying either on extensive surface coverage or restrictive assumptions to resolve ambiguities. We propose a vision-based extrinsic contact localization task: with only a single RGB-D camera view of a robot workspace, identify when and where an object held by the robot contacts the rest of the environment. We show that careful task-attuned design is critical for a neural network trained in simulation to discover solutions that transfer well to a real robot. Our final approach im2contact demonstrates the promise of versatile general-purpose contact perception from vision alone, performing well for localizing various contact types (point, line, or planar; sticking, sliding, or rolling; single or multiple), and even under occlusions in its camera view. Video results can be found at: https://sites.google.com/view/im2contact/home 
  2. Mechanical search, the finding and extracting of a known target object from a cluttered environment, is a key challenge in automating warehouse, home, retail, and industrial tasks. In this paper, we consider contexts in which occluding objects are to remain untouched, thus minimizing disruptions and avoiding toppling. We assume a 6-DOF robot with an RGBD camera and unicontact suction gripper mounted on its wrist. With this setup, the robot can move both camera and gripper in order to identify a suitable approach vector, reach in to achieve a suction grasp of the target object, and extract it. We present AVPLUG: Approach Vector PLanning for Unicontact Grasping, an algorithm that uses an octree occupancy model and Minkowski sum computation to find a collision-free grasp approach vector. Experiments in simulation and with a physical Fetch robot suggest that AVPLUG finds an approach vector up to 20× faster than a baseline search policy. 
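As a hedged sketch of the geometric core described in the abstract above: dilating an occupancy model by the gripper's footprint is a Minkowski sum, after which an approach vector is collision-free exactly when the ray it sweeps never enters a dilated cell. The dense grid (standing in for the paper's octree), the shapes, and the ray-marching test below are illustrative assumptions, not AVPLUG's implementation.

```python
import numpy as np
from scipy import ndimage

# toy occupancy grid (a stand-in for AVPLUG's octree occupancy model)
occ = np.zeros((50, 50, 50), dtype=bool)
occ[20:30, 20:30, 10:40] = True            # an occluding block near the target

# Minkowski sum of obstacles and gripper, realized as a binary dilation
gripper = np.ones((3, 3, 3), dtype=bool)   # gripper footprint as a structuring element
inflated = ndimage.binary_dilation(occ, structure=gripper)

def approach_free(target, direction, max_steps=60):
    """March from the target back along the approach direction; the vector is
    collision-free iff the ray exits the workspace without hitting a dilated cell."""
    d = np.asarray(direction, dtype=float)
    d /= np.linalg.norm(d)
    p = np.asarray(target, dtype=float)
    for _ in range(max_steps):
        p = p - d                           # step backward along the approach
        i, j, k = np.round(p).astype(int)
        if not (0 <= i < 50 and 0 <= j < 50 and 0 <= k < 50):
            return True                     # left the workspace: unobstructed
        if inflated[i, j, k]:
            return False
    return True

target = (25, 25, 45)                       # hypothetical target location
for v in [(0, 0, -1), (0, 0, 1), (1, 0, 0)]:
    print(v, approach_free(target, v))      # approaching from above succeeds
```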
  3. One important class of applications entails a robot scrutinizing, monitoring, or recording the evolution of an uncertain time-extended process. This sort of situation leads to an interesting family of active perception problems that can be cast as planning problems in which the robot is limited in what it sees and must thus choose what to pay attention to. The distinguishing characteristic of this setting is that the robot has influence over what it captures via its sensors, but exercises no causal authority over the process evolving in the world. As such, the robot's objective is to observe the underlying process and to produce a "chronicle" of occurrent events, subject to a goal specification of the sorts of event sequences that may be of interest. This paper examines variants of such problems in which the robot aims to collect sets of observations to meet a rich specification of their sequential structure. We study this class of problems by modeling the stochastic process via a variant of a hidden Markov model and specifying the event sequences of interest as a regular language, developing a vocabulary of "mutators" that enable sophisticated requirements to be expressed. Under different suppositions about the information gleaned from the event model, we formulate and solve different planning problems. The core underlying idea is the construction of a product between the event model and a specification automaton. Using this product, we compute a policy that minimizes the expected number of steps to reach a goal state. We introduce a general algorithm for this problem as well as several more efficient algorithms for important special cases. The paper reports and compares performance metrics on small case studies analyzed in depth via simulation. Specifically, we study the effect of the robot's observation model on the average time required to record a desired story, and we show that our algorithm outperforms a baseline greedy algorithm on that measure. In addition, experiments show that the algorithms tailored to specialized variants of the problem are rather more efficient than the general algorithm.
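A minimal sketch, under assumed toy dynamics, of the product construction the abstract above describes: a two-event Markov chain stands in for the event model, a three-state DFA encodes the regular language "an a, then a b" over recorded events, and value iteration over product states yields a policy minimizing the expected number of steps to the accepting state. The events, transition probabilities, DFA, and attention actions are all illustrative, not from the paper.

```python
EVENTS = ["a", "b"]
P = {"a": {"a": 0.6, "b": 0.4},            # toy event chain: P[current][next]
     "b": {"a": 0.5, "b": 0.5}}

# DFA for the specification "record an a, then a b"; state 2 accepts
DFA = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 2}
ACCEPT = 2
ACTIONS = ["watch_a", "watch_b"]           # which event the robot attends to

def advance(q, e, act):
    # the DFA advances only if the occurring event was the one being watched
    return DFA[(q, e)] if act == "watch_" + e else q

# value iteration on the product (event, dfa_state): expected steps to accept
V = {(e, q): 0.0 for e in EVENTS for q in range(3)}
for _ in range(200):
    for e, q in V:
        if q != ACCEPT:
            V[(e, q)] = min(
                1 + sum(P[e][e2] * V[(e2, advance(q, e2, a))] for e2 in EVENTS)
                for a in ACTIONS)

policy = {(e, q): min(ACTIONS,
                      key=lambda a: sum(P[e][e2] * V[(e2, advance(q, e2, a))]
                                        for e2 in EVENTS))
          for (e, q) in V if q != ACCEPT}
print({s: round(v, 2) for s, v in V.items()})
print(policy)
```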
  4. This paper presents an attention-based deep learning framework that converts robot camera frames with dynamic content into static frames, making it easier to apply simultaneous localization and mapping (SLAM) algorithms. The vast majority of SLAM methods have difficulty in the presence of dynamic objects that appear in the environment and occlude the area being captured by the camera. Despite past attempts to deal with dynamic objects, reconstructing large occluded areas with complex backgrounds remains challenging. Our proposed Dynamic-GAN framework employs a generative adversarial network to remove dynamic objects from a scene and inpaint a static image free of dynamic objects. The framework utilizes spatial-temporal transformers and a novel spatial-temporal loss function. We evaluated Dynamic-GAN comprehensively, both quantitatively and qualitatively, on benchmark datasets and on a mobile robot navigating indoor environments. With people appearing dynamically in close proximity to the robot, results showed that large, feature-rich occluded areas can be accurately reconstructed with our attention-based framework for dynamic object removal. Experiments demonstrate that our proposed algorithm performs up to 25% better on average than standard benchmark algorithms.
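The sketch below schematizes the data flow described above, with a trivial temporal fill standing in for the learned generator: dynamic regions are masked out of each frame and filled with pixels observed at earlier times when that region was static. The mask source and the fill rule are assumptions for illustration; Dynamic-GAN itself uses an adversarial generator with spatial-temporal transformers, not this nearest-in-time fill.

```python
import numpy as np

def static_view(frames, masks):
    """frames: (T, H, W, 3) uint8; masks: (T, H, W) bool, True where dynamic.
    Returns frames with dynamic pixels replaced from the most recent static view."""
    out = frames.copy()
    last_static = frames[0].copy()          # running estimate of the static scene
    for t in range(frames.shape[0]):
        m = masks[t]
        out[t][m] = last_static[m]          # inpaint dynamic pixels from history
        last_static[~m] = frames[t][~m]     # refresh wherever the scene is static
    return out                              # frames a SLAM front end could consume

# toy example: a bright "person" block sweeps across a uniform gray scene
frames = np.full((5, 8, 8, 3), 100, dtype=np.uint8)
masks = np.zeros((5, 8, 8), dtype=bool)
for t in range(5):
    frames[t, :, t:t + 2] = 255
    masks[t, :, t:t + 2] = True
print(static_view(frames, masks)[4, 0])     # last frame's row restored to gray
```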
  5. As demands on manufacturing rapidly evolve, flexible manufacturing is becoming essential to achieving the productivity needed to remain competitive. One innovative approach to flexible manufacturing is the introduction of fenceless robotic manufacturing cells that enable greater human-robot collaboration (HRC), in which a human and a robot share a space, complete tasks together, and interact with each other. Such operations, however, pose serious safety concerns. Before HRC can become a viable possibility, robots must be capable of safely operating within, and responding to events in, dynamic environments, and they must be able to do so quickly during online operation. This paper outlines an algorithm for predictive collision detection that gives the robot the ability to look ahead along its own trajectory and the trajectories of other bodies in its environment and predict potential collisions. The algorithm approximates the continuous swept volume of any articulated body along its trajectory by taking only a few time-sequential samples of the body's predicted orientations and creating surfaces that patch the orientations together with Coons patches. Runtime data collected on this algorithm suggest that it can accurately predict future collisions in under 30 ms.
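A minimal illustration of the swept-surface construction described above: a link's outline at two time-sequential samples gives two boundary curves, its endpoint motions give the other two, and a bilinearly blended Coons patch stitches them into a surface approximating the region swept between samples. The segment-shaped link, the straight-line tip motion, and the point-obstacle distance test are assumptions made for the sketch, not the paper's implementation.

```python
import numpy as np

def coons(c0, c1, d0, d1, u, v):
    """Standard bilinearly blended Coons patch; u, v in [0, 1].
    Boundaries: c0(u)=S(u,0), c1(u)=S(u,1), d0(v)=S(0,v), d1(v)=S(1,v)."""
    lofted = (1 - v) * c0(u) + v * c1(u)
    ruled = (1 - u) * d0(v) + u * d1(v)
    bilinear = ((1 - u) * (1 - v) * c0(0) + u * (1 - v) * c0(1)
                + (1 - u) * v * c1(0) + u * v * c1(1))
    return lofted + ruled - bilinear

# link modeled as a unit segment; two time-sequential orientation samples
def link_at(theta):
    return lambda u: np.array([np.cos(theta), np.sin(theta), 0.0]) * u

c0, c1 = link_at(0.0), link_at(np.pi / 3)       # link pose at t0 and at t1
d0 = lambda v: np.zeros(3)                      # base pinned at the origin
d1 = lambda v: (1 - v) * c0(1.0) + v * c1(1.0)  # tip interpolated linearly

# sample the patch and test a hypothetical point obstacle against it
us, vs = np.linspace(0, 1, 20), np.linspace(0, 1, 20)
pts = np.array([coons(c0, c1, d0, d1, u, v) for u in us for v in vs])
obstacle = np.array([0.8, 0.3, 0.0])
print("min dist to swept surface:", np.min(np.linalg.norm(pts - obstacle, axis=1)))
```

In a full collision check, several such patches (one per link face and per pair of adjacent time samples) would be tested against the other bodies' swept surfaces rather than a single point.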