
Title: Conditioning Style on Substance: Plans for Narrative Observation
We consider a robot tasked with observing its environment and later selectively summarizing what it saw as a vivid, structured narrative. The robot interacts with an uncertain environment, modelled as a stochastic process, and must decide what events to pay attention to (substance), and how to best make its recording (style) for later compilation of its summary. If carrying a video camera, for example, it must decide where to be, what to aim the camera at, and which stylistic selections, like the focus and level of zoom, are most suitable. This paper examines planning algorithms that help the robot predict events that (1) will likely occur; (2) would be useful in telling a tale; and (3) may be hewed to cohere stylistically. The third factor, a time-extended requirement, is entirely neglected in earlier, simpler work. With formulations based on underlying Markov Decision Processes, we compare two algorithms: a monolithic planner that jointly plans over events and style pairs and a decoupled approach that prescribes style conditioned on events. The decoupled approach is seen to be effective and much faster to compute, suggesting that computational expediency justifies the separation of substance from style. Finally, we also report on our hardware implementation.
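The contrast between the two planners can be illustrated with a toy sketch. This is not the paper's formulation: the world states, events, styles, and all numbers below are hypothetical, and the style payoff here depends only on the event being recorded, so the time-extended stylistic coherence that the paper addresses is not modeled. Under those assumptions, a monolithic planner searches over (event, style) pairs, while a decoupled planner chooses the event first and then looks up a style conditioned on it.

```python
from itertools import product

# Hypothetical toy model; every name and number here is an illustrative assumption.
WORLD = ["calm", "lively"]
EVENTS = ["speech", "dance"]        # substance: what to attend to
STYLES = ["wide", "close_up"]       # style: how to record it

# The robot observes but does not influence the process, so the
# transition probabilities ignore the chosen action.
NEXT = {"calm": {"calm": 0.7, "lively": 0.3},
        "lively": {"calm": 0.4, "lively": 0.6}}
EVENT_VALUE = {"calm": {"speech": 1.0, "dance": 0.2},
               "lively": {"speech": 0.3, "dance": 1.0}}
STYLE_FIT = {"speech": {"wide": 0.5, "close_up": 1.0},
             "dance": {"wide": 1.0, "close_up": 0.4}}


def q_value(state, action, values, reward, gamma):
    """One-step lookahead: immediate reward plus discounted expected value."""
    return reward(state, action) + gamma * sum(
        p * values[nxt] for nxt, p in NEXT[state].items())


def value_iteration(actions, reward, gamma=0.9, sweeps=200):
    """Plain value iteration over WORLD; returns (values, greedy policy)."""
    values = {s: 0.0 for s in WORLD}
    for _ in range(sweeps):
        values = {s: max(q_value(s, a, values, reward, gamma) for a in actions)
                  for s in WORLD}
    policy = {s: max(actions, key=lambda a: q_value(s, a, values, reward, gamma))
              for s in WORLD}
    return values, policy


# Monolithic: plan jointly over all (event, style) pairs.
joint_actions = list(product(EVENTS, STYLES))
joint_values, joint_policy = value_iteration(
    joint_actions, lambda s, a: EVENT_VALUE[s][a[0]] * STYLE_FIT[a[0]][a[1]])

# Decoupled: fix the best style per event, then plan over events only.
best_style = {e: max(STYLES, key=lambda st: STYLE_FIT[e][st]) for e in EVENTS}
event_values, event_policy = value_iteration(
    EVENTS, lambda s, e: EVENT_VALUE[s][e] * STYLE_FIT[e][best_style[e]])
decoupled_policy = {s: (e, best_style[e]) for s, e in event_policy.items()}
```

In this toy the decoupled planner evaluates 2 actions per state instead of 4 and recovers the same policy, because the reward factorizes by construction. With richer style requirements the two planners need not agree.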
Award ID(s):
1849303 1849291 1849249
Publication Date:
Journal Name:
2021 IEEE International Conference on Robotics and Automation (ICRA)
Sponsoring Org:
National Science Foundation
More Like this
  1. One important class of applications entails a robot scrutinizing, monitoring, or recording the evolution of an uncertain time-extended process. This sort of situation leads to an interesting family of active perception problems that can be cast as planning problems in which the robot is limited in what it sees and must, thus, choose what to pay attention to. The distinguishing characteristic of this setting is that the robot has influence over what it captures via its sensors, but exercises no causal authority over the process evolving in the world. As such, the robot’s objective is to observe the underlying process and to produce a “chronicle” of occurrent events, subject to a goal specification of the sorts of event sequences that may be of interest. This paper examines variants of such problems in which the robot aims to collect sets of observations to meet a rich specification of their sequential structure. We study this class of problems by modeling a stochastic process via a variant of a hidden Markov model and specify the event sequences of interest as a regular language, developing a vocabulary of “mutators” that enable sophisticated requirements to be expressed. Under different suppositions on the information gleaned about the event model, we formulate and solve different planning problems. The core underlying idea is the construction of a product between the event model and a specification automaton. Using this product, we compute a policy that minimizes the expected number of steps to reach a goal state. We introduce a general algorithm for this problem as well as several more efficient algorithms for important special cases. The paper reports and compares performance metrics by drawing on some small case studies analyzed in depth via simulation. Specifically, we study the effect of the robot’s observation model on the average time required for the robot to record a desired story. We also compare our algorithm with a baseline greedy algorithm, showing that our algorithm outperforms the greedy algorithm in terms of the average time to record a desired story. In addition, experiments show that the algorithms tailored to specialized variants of the problem are rather more efficient than the general algorithm.
  2. A robot can now grasp an object more effectively than ever before, but once it has the object what happens next? We show that a mild relaxation of the task and workspace constraints implicit in existing object grasping datasets can cause neural network based grasping algorithms to fail on even a simple block stacking task when executed under more realistic circumstances. To address this, we introduce the JHU CoSTAR Block Stacking Dataset (BSD), where a robot interacts with 5.1 cm colored blocks to complete an order-fulfillment style block stacking task. It contains dynamic scenes and real time-series data in a less constrained environment than comparable datasets. There are nearly 12,000 stacking attempts and over 2 million frames of real data. We discuss the ways in which this dataset provides a valuable resource for a broad range of other topics of investigation. We find that hand-designed neural networks that work on prior datasets do not generalize to this task. Thus, to establish a baseline for this dataset, we demonstrate an automated search of neural network based models using a novel multiple-input HyperTree MetaModel, and find a final model which makes reasonable 3D pose predictions for grasping and stacking on our dataset. The CoSTAR BSD, code, and instructions are available at
  3. As demands on manufacturing rapidly evolve, flexible manufacturing is becoming more essential for acquiring the necessary productivity to remain competitive. An innovative approach to flexible manufacturing is the introduction of fenceless robotic manufacturing cells to acquire and leverage greater human-robot collaboration (HRC). This involves operations in which a human and a robot share a space, complete tasks together, and interact with each other. Such operations, however, pose serious safety concerns. Before HRC can become a viable possibility, robots must be capable of safely operating within and responding to events in dynamic environments. Furthermore, the robot must be able to do this quickly during online operation. This paper outlines an algorithm for predictive collision detection. This algorithm gives the robot the ability to look ahead at its trajectory, and the trajectories of other bodies in its environment and predict potential collisions. The algorithm approximates a continuous swept volume of any articulated body along its trajectory by taking only a few time sequential samples of the predicted orientations of the body and creating surfaces that patch the orientations together with Coons patches. Run time data collected on this algorithm suggest that the algorithm can accurately predict future collisions in under 30 ms.
  4. We present a distributed Bayesian algorithm for robot swarms to classify a spatially distributed feature of an environment. This type of “go/no-go” decision appears in applications where a group of robots must collectively choose whether to take action, such as determining if a farm field should be treated for pests. Previous bio-inspired approaches to decentralized decision-making in robotics lack a statistical foundation, while decentralized Bayesian algorithms typically require a strongly connected network of robots. In contrast, our algorithm allows simple, sparsely distributed robots to quickly reach accurate decisions about a binary feature of their environment. We investigate the speed vs. accuracy tradeoff in decision-making by varying the algorithm’s parameters. We show that making fewer, less-correlated observations can improve decision-making accuracy, and that a well-chosen combination of prior and decision threshold allows for fast decisions with a small accuracy cost. Both speed and accuracy also improved with the addition of bio-inspired positive feedback. This algorithm is also adaptable to the difficulty of the environment. Compared to a fixed-time benchmark algorithm with accuracy guarantees, our Bayesian approach resulted in equally accurate decisions, while adapting its decision time to the difficulty of the environment.
  5. Abstract The spatial distribution and kinematics of intracontinental deformation provide insight into the dominant mode of continental tectonics: rigid-body motion versus continuum flow. The discrete San Andreas fault defines the western North America plate boundary, but transtensional deformation is distributed hundreds of kilometers eastward across the Walker Lane–Basin and Range provinces. In particular, distributed Basin and Range extension has been encroaching westward onto the relatively stable Sierra Nevada block since the Miocene, but the timing and style of distributed deformation overprinting the stable Sierra Nevada crust remains poorly resolved. Here we bracket the timing, magnitude, and kinematics of overprinting Walker Lane and Basin and Range deformation in the Pine Nut Mountains, Nevada (USA), which are the westernmost structural and topographic expression of the Basin and Range, with new geologic mapping and 40Ar/39Ar geochronology. Structural mapping suggests that north-striking normal faults developed during the initiation of Basin and Range extension and were later reactivated as northeast-striking oblique-slip faults following the onset of Walker Lane transtensional deformation. Conformable volcanic and sedimentary rocks, with new ages spanning ca. 14.2 Ma to 6.8 Ma, were tilted 30°–36° northwest by east-dipping normal faults. This relationship demonstrates that dip-slip deformation initiated after ca. 6.8 Ma. A retrodeformed cross section across the range suggests that the range experienced 14% extension. Subsequently, Walker Lane transtension initiated, and clockwise rotation of the Carson domain may have been accommodated by northeast-striking left-slip faults. Our work better defines strain patterns at the western extent of the Basin and Range province across an approximately 150-km-long east-west transect that reveals domains of low strain (∼15%) in the Carson Range–Pine Nut Mountains and Gillis Range surrounding high-magnitude extension (∼150%–180%) in the Singatse and Wassuk Ranges. There is no evidence for irregular crustal thickness variations across this same transect—either in the Mesozoic, prior to extension, or today—which suggests that strain must be accommodated differently at decoupled crustal levels to result in smooth, homogenous crustal thickness values despite the significantly heterogeneous extensional evolution. This example across an ∼150 km transect demonstrates that the use of upper-crust extension estimates to constrain pre-extension crustal thickness, assuming pure shear as commonly done for the Mesozoic Nevadaplano orogenic plateau, may not be reliable.