Title: Optimizing a Continuum Manipulator’s Search Policy Through Model-Free Reinforcement Learning
Continuum robots have long held a great potential for applications in inspection of remote, hard-to-reach environments. In future environments such as the Deep Space Gateway, remote deployment of robotic solutions will require a high level of autonomy due to communication delays and unavailability of human crews. In this work, we explore the application of policy optimization methods through Actor-Critic gradient descent in order to optimize a continuum manipulator’s search method for an unknown object. We show that we can deploy a continuum robot without prior knowledge of a goal object location and converge to a policy that finds the goal and can be reused in future deployments. We also show that the method can be quickly extended for multiple Degrees-of-Freedom and that we can restrict the policy with virtual and physical obstacles. These two scenarios are highlighted using a simulation environment with 15 and 135 unique states, respectively.
Award ID(s): 1718075
PAR ID: 10295537
Author(s) / Creator(s):
Date Published:
Journal Name: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Page Range / eLocation ID: 5564-5571
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. It is imperative that robots can understand natural language commands issued by humans. Such commands typically contain verbs that signify what action should be performed on a given object and that are applicable to many objects. We propose a method for generalizing manipulation skills to novel objects using verbs. Our method learns a probabilistic classifier that determines whether a given object trajectory can be described by a specific verb. We show that this classifier accurately generalizes to novel object categories with an average accuracy of 76.69% across 13 object categories and 14 verbs. We then perform policy search over the object kinematics to find an object trajectory that maximizes classifier prediction for a given verb. Our method allows a robot to generate a trajectory for a novel object based on a verb, which can then be used as input to a motion planner. We show that our model can generate trajectories that are usable for executing five verb commands applied to novel instances of two different object categories on a real robot. 
  2. Scene-level Programming by Demonstration (PbD) is faced with an important challenge: perceptual uncertainty. Addressing this problem, we present a scene-level PbD paradigm that programs robots to perform goal-directed manipulation in unstructured environments with grounded perception. Scene estimation is enabled by our discriminatively-informed generative scene estimation method (DIGEST). Given scene observations, DIGEST utilizes candidates from discriminative object detectors to generate and evaluate hypothesized scenes of object poses. Scene graphs are generated from the estimated object poses, which in turn are used in the PbD system for high-level task planning. We demonstrate that DIGEST performs better than existing methods and is robust to false positive detections. Building a PbD system on DIGEST, we show experiments of programming a Fetch robot to set up a tray for delivery with various objects through demonstration of goal scenes.
  3. Manipulation tasks can often be decomposed into multiple subtasks performed in parallel, e.g., sliding an object to a goal pose while maintaining contact with a table. Individual subtasks can be achieved by task-axis controllers defined relative to the objects being manipulated, and a set of object-centric controllers can be combined in a hierarchy. In prior works, such combinations are defined manually or learned from demonstrations. By contrast, we propose using reinforcement learning to dynamically compose hierarchical object-centric controllers for manipulation tasks. Experiments in both simulation and real world show how the proposed approach leads to improved sample efficiency, zero-shot generalization to novel test environments, and simulation-to-reality transfer without fine-tuning.
  4. This work focuses on active galactic nuclei (AGNs) and on the relation between the sizes of the hot dust continuum and the broad-line region (BLR). We find that the continuum size measured using optical/near-infrared interferometry (OI) is roughly twice that measured by reverberation mapping (RM). Both OI and RM continuum sizes show a tight relation with the Hβ BLR size, with only an intrinsic scatter of 0.25 dex. The masses of supermassive black holes (BHs) can hence simply be derived from a dust size in combination with a broad line width and virial factor. Since the primary uncertainty of these BH masses comes from the virial factor, the accuracy of the continuum-based BH masses is close to those based on the RM measurement of the broad emission line. Moreover, the necessary continuum measurements can be obtained on a much shorter timescale than those required monitoring for RM, and they are also more time efficient than those needed to resolve the BLR with OI. The primary goal of this work is to demonstrate a measuring of the BH mass based on the dust-continuum size with our first calibration of the R_BLR–R_d relation. The current limitation and caveats are discussed in detail. Future GRAVITY observations are expected to improve the continuum-based method and have the potential of measuring BH masses for a large sample of AGNs in the low-redshift Universe.
  5. Several recent studies have demonstrated the promise of deep visuomotor policies for robot manipulator control. Despite impressive progress, these systems are known to be vulnerable to physical disturbances, such as accidental or adversarial bumps that make them drop the manipulated object. They also tend to be distracted by visual disturbances such as objects moving in the robot’s field of view, even if the disturbance does not physically prevent the execution of the task. In this paper, we propose an approach for augmenting a deep visuomotor policy trained through demonstrations with Task Focused visual Attention (TFA). The manipulation task is specified with a natural language text such as “move the red bowl to the left”. This allows the visual attention component to concentrate on the current object that the robot needs to manipulate. We show that even in benign environments, the TFA allows the policy to consistently outperform a variant with no attention mechanism. More importantly, the new policy is significantly more robust: it regularly recovers from severe physical disturbances (such as bumps causing it to drop the object) from which the baseline policy, i.e. with no visual attention, almost never recovers. In addition, we show that the proposed policy performs correctly in the presence of a wide class of visual disturbances, exhibiting a behavior reminiscent of human selective visual attention experiments. 