Title: Hierarchical Control and Learning of a Foraging CyberOctopus
Inspired by the unique neurophysiology of the octopus, a hierarchical framework is proposed that simplifies the coordination of multiple soft arms by decomposing control into high‐level decision‐making, low‐level motor activation, and local reflexive behaviors via sensory feedback. When evaluated in the illustrative problem of a model octopus foraging for food, this hierarchical decomposition results in significant improvements relative to end‐to‐end methods. Performance is achieved through a mixed‐modes approach, whereby qualitatively different tasks are addressed via complementary control schemes. Herein, model‐free reinforcement learning is employed for high‐level decision‐making, while model‐based energy shaping takes care of arm‐level motor execution. To render the pairing computationally tenable, a novel neural network energy shaping (NN‐ES) controller is developed, achieving accurate motions with time‐to‐solutions 200 times faster than previous attempts. The hierarchical framework is then successfully deployed in increasingly challenging foraging scenarios, including an arena littered with obstacles in 3D space, demonstrating the viability of the approach.
Award ID(s):
2209322 1830881
PAR ID:
10441484
Author(s) / Creator(s):
 ;  ;  ;  ;  ;  ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Advanced Intelligent Systems
Volume:
5
Issue:
9
ISSN:
2640-4567
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Robots acting in human-scale environments must plan under uncertainty in large state–action spaces and face constantly changing reward functions as requirements and goals change. Planning under uncertainty in large state–action spaces requires hierarchical abstraction for efficient computation. We introduce a new hierarchical planning framework called Abstract Markov Decision Processes (AMDPs) that can plan in a fraction of the time needed for complex decision making in ordinary MDPs. AMDPs provide abstract states, actions, and transition dynamics in multiple layers above a base-level “flat” MDP. AMDPs decompose problems into a series of subtasks with both local reward and local transition functions used to create policies for subtasks. The resulting hierarchical planning method is independently optimal at each level of abstraction, and is recursively optimal when the local reward and transition functions are correct. We present empirical results showing significantly improved planning speed, while maintaining solution quality, in the Taxi domain and in a mobile-manipulation robotics problem. Furthermore, our approach allows specification of a decision-making model for a mobile-manipulation problem on a Turtlebot, spanning from low-level control actions operating on continuous variables all the way up through high-level object manipulation tasks. 
  2. In recent years, the focus has been on enhancing user comfort in commercial buildings while cutting energy costs. Efforts have mainly centered on improving HVAC systems, the central control system. However, it’s evident that HVAC alone can’t ensure occupant comfort. Lighting, blinds, and windows, often overlooked, also impact energy use and comfort. This paper introduces a holistic approach to managing the delicate balance between energy efficiency and occupant comfort in commercial buildings. We present OCTOPUS, a system employing a deep reinforcement learning (DRL) framework using data-driven techniques to optimize control sequences for all building subsystems, including HVAC, lighting, blinds, and windows. OCTOPUS’s DRL architecture features a unique reward function facilitating the exploration of tradeoffs between energy usage and user comfort, effectively addressing the high-dimensional control problem resulting from interactions among these four building subsystems. To meet data training requirements, we emphasize the importance of calibrated simulations that closely replicate target-building operational conditions. We train OCTOPUS using 10-year weather data and a calibrated building model in the EnergyPlus simulator. Extensive simulations demonstrate that OCTOPUS achieves substantial energy savings, outperforming state-of-the-art rule-based and DRL-based methods by 14.26% and 8.1%, respectively, in a LEED Gold Certified building while maintaining desired human comfort levels.
  3. Flexible octopus arms exhibit an exceptional ability to coordinate large numbers of degrees of freedom and perform complex manipulation tasks. As a consequence, these systems continue to attract the attention of biologists and roboticists alike. In this article, we develop a three-dimensional model of a soft octopus arm, equipped with biomechanically realistic muscle actuation. Internal forces and couples exerted by all major muscle groups are considered. An energy-shaping control method is described to coordinate muscle activity so as to grasp and reach in three-dimensional space. Key contributions of this article are as follows: (i) modelling of major muscle groups to elicit three-dimensional movements; (ii) a mathematical formulation for muscle activations based on a stored energy function; and (iii) a computationally efficient procedure to design task-specific equilibrium configurations, obtained by solving an optimization problem in the Special Euclidean group SE(3). Muscle controls are then iteratively computed based on the co-state variable arising from the solution of the optimization problem. The approach is numerically demonstrated in the physically accurate software environment Elastica. Results of numerical experiments mimicking observed octopus behaviours are reported.
  4. We present a hierarchical control approach for maneuvering an autonomous vehicle (AV) in tightly-constrained environments where other moving AVs and/or human-driven vehicles are present. A two-level hierarchy is proposed: a high-level data-driven strategy predictor and a lower-level model-based feedback controller. The strategy predictor maps an encoding of a dynamic environment to a set of high-level strategies via a neural network. Depending on the selected strategy, a set of time-varying hyperplanes in the AV’s position space is generated online and the corresponding halfspace constraints are included in a lower-level model-based receding horizon controller. These strategy-dependent constraints drive the vehicle towards areas where it is likely to remain feasible. Moreover, the predicted strategy also informs switching between a discrete set of policies, which allows for more conservative behavior when prediction confidence is low. We demonstrate the effectiveness of the proposed data-driven hierarchical control framework in a two-car collision avoidance scenario through simulations and experiments on a 1/10 scale autonomous car platform where the strategy-guided approach outperforms a model predictive control baseline in both cases.