Title: Hierarchical Control and Learning of a Foraging CyberOctopus
Inspired by the unique neurophysiology of the octopus, a hierarchical framework is proposed that simplifies the coordination of multiple soft arms by decomposing control into high‐level decision‐making, low‐level motor activation, and local reflexive behaviors via sensory feedback. When evaluated in the illustrative problem of a model octopus foraging for food, this hierarchical decomposition results in significant improvements relative to end‐to‐end methods. Performance is achieved through a mixed‐modes approach, whereby qualitatively different tasks are addressed via complementary control schemes. Herein, model‐free reinforcement learning is employed for high‐level decision‐making, while model‐based energy shaping takes care of arm‐level motor execution. To render the pairing computationally tenable, a novel neural network energy shaping (NN‐ES) controller is developed, achieving accurate motions with time‐to‐solutions 200 times faster than previous attempts. The hierarchical framework is then successfully deployed in increasingly challenging foraging scenarios, including an arena littered with obstacles in 3D space, demonstrating the viability of the approach.
Award ID(s):
2209322 1830881
PAR ID:
10441484
Author(s) / Creator(s):
 ;  ;  ;  ;  ;  ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Advanced Intelligent Systems
Volume:
5
Issue:
9
ISSN:
2640-4567
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Robots acting in human-scale environments must plan under uncertainty in large state–action spaces and face constantly changing reward functions as requirements and goals change. Planning under uncertainty in large state–action spaces requires hierarchical abstraction for efficient computation. We introduce a new hierarchical planning framework called Abstract Markov Decision Processes (AMDPs) that can plan in a fraction of the time needed for complex decision making in ordinary MDPs. AMDPs provide abstract states, actions, and transition dynamics in multiple layers above a base-level “flat” MDP. AMDPs decompose problems into a series of subtasks with both local reward and local transition functions used to create policies for subtasks. The resulting hierarchical planning method is independently optimal at each level of abstraction, and is recursively optimal when the local reward and transition functions are correct. We present empirical results showing significantly improved planning speed, while maintaining solution quality, in the Taxi domain and in a mobile-manipulation robotics problem. Furthermore, our approach allows specification of a decision-making model for a mobile-manipulation problem on a Turtlebot, spanning from low-level control actions operating on continuous variables all the way up through high-level object manipulation tasks. 
  2. In recent years, the focus has been on enhancing user comfort in commercial buildings while cutting energy costs. Efforts have mainly centered on improving HVAC systems, the central control system. However, it’s evident that HVAC alone can’t ensure occupant comfort. Lighting, blinds, and windows, often overlooked, also impact energy use and comfort. This paper introduces a holistic approach to managing the delicate balance between energy efficiency and occupant comfort in commercial buildings. We present OCTOPUS, a system employing a deep reinforcement learning (DRL) framework using data-driven techniques to optimize control sequences for all building subsystems, including HVAC, lighting, blinds, and windows. OCTOPUS’s DRL architecture features a unique reward function facilitating the exploration of tradeoffs between energy usage and user comfort, effectively addressing the high-dimensional control problem resulting from interactions among these four building subsystems. To meet data training requirements, we emphasize the importance of calibrated simulations that closely replicate target-building operational conditions. We train OCTOPUS using 10-year weather data and a calibrated building model in the EnergyPlus simulator. Extensive simulations demonstrate that OCTOPUS achieves substantial energy savings, outperforming state-of-the-art rule-based and DRL-based methods by 14.26% and 8.1%, respectively, in a LEED Gold Certified building while maintaining desired human comfort levels.
  3. Flexible octopus arms exhibit an exceptional ability to coordinate large numbers of degrees of freedom and perform complex manipulation tasks. As a consequence, these systems continue to attract the attention of biologists and roboticists alike. In this article, we develop a three-dimensional model of a soft octopus arm, equipped with biomechanically realistic muscle actuation. Internal forces and couples exerted by all major muscle groups are considered. An energy-shaping control method is described to coordinate muscle activity so as to grasp and reach in three-dimensional space. Key contributions of this article are as follows: (i) modelling of major muscle groups to elicit three-dimensional movements; (ii) a mathematical formulation for muscle activations based on a stored energy function; and (iii) a computationally efficient procedure to design task-specific equilibrium configurations, obtained by solving an optimization problem in the Special Euclidean group SE(3). Muscle controls are then iteratively computed based on the co-state variable arising from the solution of the optimization problem. The approach is numerically demonstrated in the physically accurate software environment Elastica. Results of numerical experiments mimicking observed octopus behaviours are reported.
  4. Ndiribe, Charlotte (Ed.)
    Population growth models typically incorporate attributes observable at the population scale, often overlooking the trade-off between individual-level reproductive and behavioral traits and their influence on population size. Individuals’ survival and reproductive abilities are expected to dynamically evolve depending on the population size, which is affected by the aggregation of individual decisions. Reconciling individual-level incentives with population-level dynamics requires an integrative framework that explicitly addresses the intertwined relationships between population growth and individual decision-making processes. We formulate a multiscale modeling framework that integrates the logistic population growth model with an optimal foraging model to study the interplay between individual-level behavioral incentives and population growth dynamics. Specifically, we explicitly model individuals’ decision-making process, which shapes their reproductive fitness and, ultimately, influences population growth. Moreover, we incorporate the concept of resource limitations from the logistic growth model to account for dynamic incentives that depend on population size. Our results yield insights into the multiscale processes, such as the selection pressure of behavioral choices and the cost-benefit of social activities that influence population robustness beyond mere size and aggregated reproductive traits. We found that populations exhibiting similar limiting sizes may undergo significantly different transient dynamics. This variation may be induced by environments imposing distinct behavioral cost-benefit trade-offs that require individuals to exert different levels of foraging effort to maintain reproductive viability.