

Title: DIPN: Deep Interaction Prediction Network with Application to Clutter Removal
We propose a Deep Interaction Prediction Network (DIPN) for learning to predict complex interactions that ensue as a robot end-effector pushes multiple objects, whose physical properties, including size, shape, mass, and friction coefficients, may be unknown a priori. DIPN “imagines” the effect of a push action and generates an accurate synthetic image of the predicted outcome. DIPN is shown to be sample efficient when trained in simulation or with a real robotic system. The high accuracy of DIPN allows direct integration with a grasp network, yielding a robotic manipulation system capable of executing challenging clutter removal tasks while being trained in a fully self-supervised manner. The overall network demonstrates intelligent behavior in selecting proper actions between push and grasp for completing clutter removal tasks and significantly outperforms the previous state of the art. Remarkably, DIPN achieves even better performance on the real robotic hardware system than in simulation.
Award ID(s):
1845888 1734419
NSF-PAR ID:
10219061
Journal Name:
IEEE International Conference on Robotics and Automation
ISSN:
1049-3492
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Generative Attention Learning (GenerAL) is a framework for high-DOF multi-fingered grasping that is not only robust to dense clutter and novel objects but also effective with a variety of different parallel-jaw and multi-fingered robot hands. This framework introduces a novel attention mechanism that substantially improves the grasp success rate in clutter. Its generative nature allows the learning of full-DOF grasps with flexible end-effector positions and orientations, as well as all finger joint angles of the hand. Trained purely in simulation, this framework skillfully closes the sim-to-real gap. To close the visual sim-to-real gap, this framework uses a single depth image as input. To close the dynamics sim-to-real gap, this framework circumvents continuous motor control with a direct mapping from pixel to Cartesian space inferred from the same depth image. Finally, this framework demonstrates inter-robot generality by achieving over 92% real-world grasp success rates in cluttered scenes with novel objects using two multi-fingered robotic hand-arm systems with different degrees of freedom. 
  2. The central theme in robotic manipulation is that of the robot interacting with the world through physical contact. We tend to describe that physical contact using specific words that capture the nature of the contact and the action, such as grasp, roll, pivot, push, pull, tilt, close, open, etc. We refer to these situation-specific actions as manipulation primitives. Due to the nonlinear and nonsmooth nature of physical interaction, roboticists have devoted significant effort to studying individual manipulation primitives. However, studying individual primitives one by one is an inherently limited process, due to engineering costs, overfitting to specific tasks, and lack of robustness to unforeseen variations. These limitations motivate the main contribution of this paper: a complete and general framework to autogenerate manipulation primitives. To do so, we develop the theory and computation of contact modes as a means to classify and enumerate manipulation primitives. The contact modes form a graph, specifically a lattice. Our algorithm to autogenerate manipulation primitives (AMP) performs graph-based optimization on the contact mode lattice and solves a linear program to generate each primitive. We designed several experiments to validate our approach. We benchmarked a wide range of contact scenarios, and our pipeline’s runtime was consistently in the tens of milliseconds. In simulation, we planned manipulation sequences using AMP. In the real world, we showcased the robustness of our approach to modeling errors. We hope that our contributions will lead to more general and robust approaches for robotic manipulation.
  3. Robot manipulation and grasping mechanisms have received considerable attention in the recent past, leading to the development of a wide range of industrial applications. This paper proposes an autonomous robotic grasping system for object-sorting applications. The robot uses RGB-D data to perform object detection, pose estimation, trajectory generation, and object sorting. The proposed approach can also grasp specific objects chosen by users. Trained convolutional neural networks perform object detection and determine the point cloud cluster corresponding to the object to be grasped. From the selected point cloud data, a grasp generator algorithm outputs potential grasps. A grasp filter then scores these potential grasps, and the highest-scored grasp is chosen for execution on a real robot. A motion planner generates collision-free trajectories to execute the chosen grasp. Experiments on an AUBO robotic manipulator show the potential of the proposed approach for autonomous object sorting, with robust and fast sorting performance.
  4. Robot-assisted surgery has advanced rapidly over the past two decades. More recently, the automation of robotic surgical tasks has become a focus of research. In this area, the detection and tracking of a surgical tool are crucial for an autonomous system to plan and perform a procedure. For example, knowing the position and posture of a needle is a prerequisite for an automatic suturing system to grasp it and perform suturing tasks. In this paper, we propose a novel method, based on Deep Learning and Point-to-point Registration, to track the 6 degrees of freedom (DOF) pose of a metal suture needle from a robotic endoscope (an Endoscopic Camera Manipulator from the da Vinci Robotic Surgical Systems), without the help of any marker. The proposed approach was implemented and evaluated in a standard simulated surgical environment provided by the 2021–2022 AccelNet Surgical Robotics Challenge, thus demonstrating the potential to be translated into a real-world scenario. A customized dataset containing 836 images collected from the simulated scene, with ground-truth poses and key point information, was constructed to train the neural network model. The best pipeline achieved an average position error of 1.76 mm and an average orientation error of 8.55 degrees, and it can run at up to 10 Hz on a PC.