A great number of robotics applications demand the rearrangement of many mobile objects, for example, organizing products on store shelves, shuffling containers at shipping ports, reconfiguring fleets of mobile robots, and so on. To boost the efficiency/throughput in systems designed for solving these rearrangement problems, it is essential to minimize the number of atomic operations that are involved, for example, the pick-n-places of individual objects. However, this optimization task poses a rather difficult challenge due to the complex inter-dependency between the objects, especially when they are tightly packed together. In this work, in tackling the aforementioned challenges, we have developed a novel algorithmic tool, called Rubik Tables, that provides a clean abstraction of object rearrangement problems as the proxy problem of shuffling items stored in a table or lattice. In its basic form, a Rubik Table is an n × n table containing n2items. We show that the reconfiguration of items in such a Rubik Table can be achieved using at most n column and n row shuffles in the partially labeled setting, where each column (resp., row) shuffle may arbitrarily permute the items stored in a column (resp., row) of the table. When items are fully distinguishable, additional nmore »
- Award ID(s):
- Publication Date:
- NSF-PAR ID:
- Journal Name:
- IEEE International Conference on Robotics and Automation
- Sponsoring Org:
- National Science Foundation
More Like this
Intelligent robots frequently need to explore the objects in their working environments. Modern sensors have enabled robots to learn object properties via perception of multiple modalities. However, object exploration in the real world poses a challenging trade-off between information gains and exploration action costs. Mixed observability Markov decision process (MOMDP) is a framework for planning under uncertainty, while accounting for both fully and partially observable components of the state. Robot perception frequently has to face such mixed observability. This work enables a robot equipped with an arm to dynamically construct query-oriented MOMDPs for multi-modal predicate identification (MPI) of objects. The robot's behavioral policy is learned from two datasets collected using real robots. Our approach enables a robot to explore object properties in a way that is significantly faster while improving accuracies in comparison to existing methods that rely on hand-coded exploration strategies.
The paper introduces a new algorithm for planning in partially observable Markov decision processes (POMDP) based on the idea of aggregate simulation. The algorithm uses product distributions to approximate the belief state and shows how to build a representation graph of an approximate action-value function over belief space. The graph captures the result of simulating the model in aggregate under independence assumptions, giving a symbolic representation of the value function. The algorithm supports large observation spaces using sampling networks, a representation of the process of sampling values of observations, which is integrated into the graph representation. Following previous work in MDPs this approach enables action selection in POMDPs through gradient optimization over the graph representation. This approach complements recent algorithms for POMDPs which are based on particle representations of belief states and an explicit search for action selection. Our approach enables scaling to large factored action spaces in addition to large state spaces and observation spaces. An experimental evaluation demonstrates that the algorithm provides excellent performance relative to state of the art in large POMDP problems.
Modeling human activity dynamics: an object-class oriented space–time composite model based on social media and urban infrastructure data
Modeling human activity dynamics is important for many application domains. However, there are problems inherent in modeling population information, since the number of people inside a given area can change dynamically over time. Here, a cyberGIS-enabled spatiotemporal population model is developed by combining Twitter data with urban infrastructure registry data to estimate human activity dynamics. This model is an object-class oriented space–time composite model, in which real-world phenomena are modeled as spatiotemporal objects, and people can move from one object to another over time. In this research, all spatiotemporal objects are aggregated into 14 spatiotemporal object classes, and all objects in a given space at different times can be projected down to a spatial plane to generate a common spatiotemporal map. A temporal weight matrix is derived from Twitter activity curves for each spatiotemporal object class and represents population dynamics for each object class at different hours of a day. Finally, model performance is evaluated by using a comparison to registered census data. This spatiotemporal human activity dynamics model was developed in a cyberGIS computing environment, which enables computational and data intensive problem solving. The results of this research can be used to support spatial decision-making in various applicationmore »
Multimodal estimation and communication of latent semantic knowledge for robust execution of robot instructionsThe goal of this article is to enable robots to perform robust task execution following human instructions in partially observable environments. A robot’s ability to interpret and execute commands is fundamentally tied to its semantic world knowledge. Commonly, robots use exteroceptive sensors, such as cameras or LiDAR, to detect entities in the workspace and infer their visual properties and spatial relationships. However, semantic world properties are often visually imperceptible. We posit the use of non-exteroceptive modalities including physical proprioception, factual descriptions, and domain knowledge as mechanisms for inferring semantic properties of objects. We introduce a probabilistic model that fuses linguistic knowledge with visual and haptic observations into a cumulative belief over latent world attributes to infer the meaning of instructions and execute the instructed tasks in a manner robust to erroneous, noisy, or contradictory evidence. In addition, we provide a method that allows the robot to communicate knowledge dissonance back to the human as a means of correcting errors in the operator’s world model. Finally, we propose an efficient framework that anticipates possible linguistic interactions and infers the associated groundings for the current world state, thereby bootstrapping both language understanding and generation. We present experiments on manipulators for tasks thatmore »