

Title: Reinforcement Learning Assist-as-needed Control for Robot Assisted Gait Training
The primary goal of an assist-as-needed (AAN) controller is to maximize subjects' active participation during motor training tasks while allowing moderate tracking errors to encourage human learning of a target movement. Impedance control is typically employed by AAN controllers to create a compliant force-field around the desired motion trajectory. To accommodate different individuals with varying motor abilities, most existing AAN controllers require extensive manual tuning of the control parameters, resulting in a tedious and time-consuming process. In this paper, we propose a reinforcement learning AAN controller that can autonomously reshape the force-field in real time based on subjects' training performance. The use of action-dependent heuristic dynamic programming enables a model-free implementation of the proposed controller. To experimentally validate the controller, a group of healthy individuals participated in a gait training session in which they were asked to learn a modified gait pattern with the help of a powered ankle-foot orthosis. Results indicated the potential of the proposed control strategy for robot-assisted gait training.
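As a concrete illustration of the compliant force-field the abstract describes, the impedance law below is a minimal sketch: a spring-damper field around the desired trajectory whose stiffness an AAN controller would adapt online. All names, gains, and numbers here are illustrative assumptions, not the paper's implementation.

```python
import math

def impedance_force(q_des, qd_des, q, qd, K, B):
    # Virtual spring-damper force-field pulling the joint back toward
    # the desired trajectory; K (stiffness) and B (damping) shape how
    # compliant the field is. An AAN controller like the one described
    # above would adapt these gains online rather than keep them fixed.
    return K * (q_des - q) + B * (qd_des - qd)

# Illustrative numbers: a 5-degree ankle tracking error with zero
# velocity error produces a corrective torque proportional to K.
err = math.radians(5.0)
tau = impedance_force(q_des=0.3, qd_des=0.0, q=0.3 - err, qd=0.0, K=50.0, B=2.0)
```

Lowering K widens the "tunnel" around the trajectory and shifts effort to the subject, which is the assist-as-needed trade-off the controller tunes.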
Award ID(s):
1944203
NSF-PAR ID:
10216891
Journal Name:
2020 8th IEEE RAS/EMBS International Conference for Biomedical Robotics and Biomechatronics (BioRob)
Page Range / eLocation ID:
785 to 790
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background Few studies have systematically investigated robust controllers for lower limb rehabilitation exoskeletons (LLREs) that can safely and effectively assist users with a variety of neuromuscular disorders to walk with full autonomy. One of the key challenges in developing such a robust controller is handling different degrees of uncertain human-exoskeleton interaction forces from the patients. Consequently, conventional walking controllers are either patient-condition specific or involve tuning many control parameters, and they can behave unreliably and even fail to maintain balance. Methods We present a novel deep-neural-network, reinforcement-learning-based robust controller for an LLRE based on decoupled offline human-exoskeleton simulation training with three independent networks, which aims to provide reliable walking assistance against varied and uncertain human-exoskeleton interaction forces. The exoskeleton controller is driven by a neural network control policy that acts on a stream of the LLRE's proprioceptive signals, including joint kinematic states, and predicts real-time position control targets for the actuated joints. To handle uncertain human interaction forces, the control policy is intentionally trained with an integrated human musculoskeletal model and realistic human-exoskeleton interaction forces. Two other neural networks are connected to the control policy network to predict the interaction forces and muscle coordination. To further increase the robustness of the control policy to different human conditions, we employ domain randomization during training that includes not only randomization of exoskeleton dynamics properties but, more importantly, randomization of human muscle strength to simulate the variability of the patient's disability.
Through this decoupled deep reinforcement learning framework, the trained controller is able to provide reliable walking assistance to patients with different degrees of neuromuscular disorders without any control parameter tuning. Results and conclusion A universal, RL-based walking controller is trained and virtually tested on an LLRE system to verify its effectiveness and robustness in assisting users with different disabilities, such as passive muscles (quadriplegic), muscle weakness, or hemiplegic conditions, without any control parameter tuning. Analysis of the RMSE for joint tracking, CoP-based stability, and gait symmetry shows the effectiveness of the controller. An ablation study also demonstrates the strong robustness of the control policy under large ranges of exoskeleton dynamic properties and various human-exoskeleton interaction forces. The decoupled network structure allows us to isolate the LLRE control policy network for testing and sim-to-real transfer, since it uses only proprioceptive information of the LLRE (joint sensory state) as input. Furthermore, the controller is shown to handle different patient conditions without the need for patient-specific control parameter tuning.
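The domain randomization step this abstract describes can be sketched as a per-episode draw of simulator parameters. The ranges, names, and number of muscles below are illustrative assumptions, not the paper's values.

```python
import random

def sample_training_domain(rng, n_muscles=8):
    # Per-episode domain randomization in the spirit of the training
    # scheme described above: draw exoskeleton dynamics scales and
    # per-muscle strength scales so one policy sees many "patients".
    # All ranges and field names here are illustrative assumptions.
    return {
        "mass_scale": rng.uniform(0.8, 1.2),      # exoskeleton link masses
        "friction_scale": rng.uniform(0.5, 1.5),  # joint friction
        # 0.0 ~ fully passive muscle (quadriplegic), 1.0 ~ healthy strength
        "muscle_strength": [rng.uniform(0.0, 1.0) for _ in range(n_muscles)],
    }

rng = random.Random(0)
domain = sample_training_domain(rng)
```

A fresh draw before each training episode forces the policy to work across the whole parameter range instead of overfitting one patient condition.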
  2.
    In this work, we introduce a novel approach to assistive exoskeleton (or powered orthosis) control that avoids the need for task and gait-phase information. Our approach is based on directly designing the Hamiltonian dynamics of the target closed-loop behavior, shaping the energy of the human and the robot. Relative to previous energy-shaping controllers for assistive exoskeletons, we introduce ground reaction force and torque information into the target behavior definition, reformulate the kinematics to avoid explicit matching conditions due to underactuation, and avoid the need to switch between swing and stance energy shapes. Our controller introduces new states into the target Hamiltonian energy that represent a virtual second leg connected to the physical leg by virtual springs. The impulse the human imparts to the physical leg is amplified and applied to the virtual leg, but the ground reaction force acts only on the physical leg. A state transformation allows the proposed controller to be implemented using only encoders, an IMU, and ground reaction force sensors. We prove that this controller is stable and passive when acted on by the ground reaction force and demonstrate its strength-amplifying behavior in simulation. A linear analysis based on small-signal assumptions explains the relationship between our tuning parameters and the frequency-domain amplification bandwidth.
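The virtual-leg coupling this abstract describes can be sketched for a single scalar joint: a virtual spring ties the virtual-leg coordinate to the physical one, and the human's input is amplified only on the virtual side. This is a hedged sketch of the idea, not the paper's Hamiltonian controller; all names, the scalar state, and the numbers are illustrative assumptions.

```python
def virtual_leg_coupling(q_phys, q_virt, k_spring, tau_human, alpha):
    # Scalar-joint sketch of the coupling described above: a virtual
    # spring ties the virtual-leg coordinate q_virt to the physical leg
    # q_phys, and the measured human torque is amplified by alpha before
    # driving the virtual leg. The ground reaction force enters only the
    # physical-leg dynamics, so it is omitted here.
    spring = k_spring * (q_virt - q_phys)  # spring force on the physical leg
    tau_phys = spring
    tau_virt = alpha * tau_human - spring  # amplified human input plus the
                                           # equal-and-opposite spring force
    return tau_phys, tau_virt

tau_p, tau_v = virtual_leg_coupling(q_phys=0.1, q_virt=0.2,
                                    k_spring=100.0, tau_human=1.0, alpha=3.0)
```

The equal-and-opposite spring forces are what keep the coupling passive: energy injected by the amplified input goes to the virtual leg, while the physical leg only exchanges energy through the spring.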
  3. Powered prostheses aim to mimic the missing biological limb with controllers that are finely tuned to replicate the nominal gait pattern of non-amputee individuals. Unfortunately, this control approach poses a problem for real-world ambulation, which includes tasks such as crossing over obstacles, where the prosthesis trajectory must be modified to provide adequate foot clearance and ensure timely foot placement. Here, we show an indirect volitional control approach that enables prosthesis users to walk at different speeds while smoothly and continuously crossing over obstacles of different sizes without explicit classification of the environment. At the high level, the proposed controller relies on a heuristic algorithm to continuously change the maximum knee flexion angle and the swing duration in harmony with the user's residual limb. At the low level, minimum-jerk planning is used to continuously adapt the swing trajectory while maximizing smoothness. Experiments with three individuals with above-knee amputation show that the proposed control approach allows for volitional control of foot clearance, which is necessary to negotiate environmental barriers. Our study suggests that a powered prosthesis controller with intrinsic, volitional adaptability may provide prosthesis users with functionality that is not currently available, facilitating real-world ambulation.
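The minimum-jerk planning mentioned in this abstract has a standard closed form for rest-to-rest motion (the Flash-Hogan quintic). The sketch below shows that standalone form; the paper's planner re-plans continuously mid-swing, and the joint angles and timing used here are illustrative assumptions.

```python
def min_jerk(t, T, x0, xf):
    # Minimum-jerk (Flash-Hogan quintic) position profile between rest
    # states x0 and xf over duration T. The low-level planner described
    # above continuously re-plans such profiles as the swing target and
    # duration change; this standalone rest-to-rest form is a sketch.
    s = min(max(t / T, 0.0), 1.0)  # normalized time, clamped to [0, 1]
    return x0 + (xf - x0) * (10 * s**3 - 15 * s**4 + 6 * s**5)

# Illustrative swing: knee flexion from 5 to 65 degrees over 0.4 s.
# By symmetry, the midpoint of the profile is exactly halfway.
mid = min_jerk(0.2, 0.4, 5.0, 65.0)
```

Because the polynomial has zero velocity and acceleration at both endpoints, changing `xf` or `T` mid-swing and re-planning from the current state keeps the trajectory smooth, which is the property the controller exploits.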

  4.
    Powered ankle exoskeletons that apply assistive torques with optimized timing and magnitude can reduce metabolic cost by ∼10% compared to normal walking. However, finding individualized optimal control parameters is time consuming and must be done independently for different walking modes (e.g., speeds, slopes). Thus, there is a need for exoskeleton controllers that can continuously adapt torque assistance in concert with changing locomotor demands. One option is to use a biologically inspired, model-based control scheme that can capture the adaptive behavior of the human plantarflexors during natural gait. Here, based on previously demonstrated success in a powered ankle-foot prosthesis, we developed an ankle exoskeleton controller that uses a neuromuscular model (NMM) composed of a Hill-type musculotendon driven by a simple positive force feedback reflex loop. To examine the effects of NMM reflex parameter settings on (i) ankle exoskeleton mechanical performance and (ii) users' physiological response, we recruited nine healthy, young adults to walk on a treadmill at a fixed speed of 1.25 m/s while wearing bilateral tethered robotic ankle exoskeletons. To quantify exoskeleton mechanics, we measured exoskeleton torque and power output across a range of NMM controller Gain (0.8–2.0) and Delay (10–40 ms) settings, as well as a High Gain/High Delay (2.0/40 ms) combination. To quantify users' physiological response, we compared joint kinematics and kinetics, ankle muscle electromyography, and metabolic rate between powered and unpowered/zero-torque conditions. Increasing NMM controller reflex Gain increased average ankle exoskeleton torque and net power output, while increasing NMM controller reflex Delay decreased net ankle exoskeleton power output.
Despite a systematic reduction in users' average biological ankle moment with exoskeleton mechanical assistance, we found no NMM controller Gain or Delay settings that yielded changes in metabolic rate. Post hoc analyses revealed at best a weak association between exoskeleton and biological mechanics and changes in users' metabolic rate. Instead, changes in users' summed ankle joint muscle activity with powered assistance correlated with changes in their metabolic energy use, highlighting the potential of muscle electromyography as a target for online optimization in next-generation adaptive exoskeleton controllers.
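The positive force-feedback reflex loop at the core of the NMM controller can be sketched as a delayed, gain-scaled feedback of musculotendon force onto muscle stimulation. The structure below is a sketch under that assumption; the buffer implementation, pre-stimulation value, and the constant-force input are illustrative, not the paper's model.

```python
from collections import deque

def make_reflex(gain, delay_steps, pre_stim=0.01):
    # Positive force-feedback reflex of the kind the NMM controller uses:
    # stimulation at step t is pre_stim + gain * F(t - delay), where F is
    # the normalized musculotendon force. Delay is counted in control
    # steps; the gain/delay/pre-stimulation values below are illustrative.
    buf = deque([0.0] * delay_steps, maxlen=delay_steps)

    def step(muscle_force):
        delayed = buf[0]          # force measured delay_steps ago
        buf.append(muscle_force)  # oldest sample drops out automatically
        return min(max(pre_stim + gain * delayed, 0.0), 1.0)  # clamp to [0, 1]

    return step

reflex = make_reflex(gain=1.2, delay_steps=4)
stims = [reflex(0.5) for _ in range(8)]  # constant normalized force input
```

Raising the gain scales stimulation (and hence exoskeleton torque) with sensed force, while a longer delay shifts when in the gait cycle that torque arrives, matching the Gain/Delay trends the study reports.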
  5. We consider the problem of optimal control of district cooling energy plants (DCEPs) consisting of multiple chillers, a cooling tower, and a thermal energy storage (TES) system, in the presence of time-varying electricity prices. A straightforward application of model predictive control (MPC) requires solving a challenging mixed-integer nonlinear program (MINLP) because of the on/off operation of the chillers and the complexity of the DCEP model. Reinforcement learning (RL) is an attractive alternative since its real-time control computation is much simpler. But designing an RL controller is challenging due to myriad design choices and computationally intensive training. In this paper, we propose an RL controller and an MPC controller for minimizing the electricity cost of a DCEP and compare them via simulations. The two controllers are designed to be comparable in terms of objective and information requirements. The RL controller uses a novel Q-learning algorithm based on least-squares policy iteration. We describe the design choices for the RL controller, including the choice of state space and basis functions, that are found to be effective. The proposed MPC controller does not need a mixed-integer solver for implementation, only a nonlinear program (NLP) solver. A rule-based baseline controller is also proposed to aid comparison. Simulation results show that the proposed RL and MPC controllers achieve similar savings over the baseline controller, about 17%.
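Least-squares policy iteration, which the RL controller builds on, alternates policy evaluation and improvement; the evaluation step (LSTD-Q) fits linear Q-function weights from a batch of transitions. The sketch below shows that one step in generic form; the feature map, toy samples, and ridge term are illustrative assumptions, not the paper's DCEP state space or basis functions.

```python
import numpy as np

def lstdq(samples, phi, gamma, n_features, ridge=1e-6):
    # One least-squares policy-evaluation (LSTD-Q) step of the kind LSPI
    # iterates: solve A w = b from (s, a, r, s', a') samples, where a' is
    # the action the current policy takes in s'. The fitted weights w give
    # the linear Q-function Q(s, a) = phi(s, a) . w for that policy.
    A = np.zeros((n_features, n_features))
    b = np.zeros(n_features)
    for s, a, r, s_next, a_next in samples:
        f = phi(s, a)
        f_next = phi(s_next, a_next)
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    # small ridge term keeps A invertible on short sample batches
    return np.linalg.solve(A + ridge * np.eye(n_features), b)

# Toy check: single state/action, reward 1, discount 0.9, so the fixed
# point is Q = 1 / (1 - 0.9) = 10.
phi = lambda s, a: np.array([1.0])
samples = [(0, 0, 1.0, 0, 0)] * 20
w = lstdq(samples, phi, gamma=0.9, n_features=1)
```

Because the weights come from one batch linear solve rather than an online gradient loop, the real-time controller only needs to evaluate `phi(s, a) . w` over candidate actions, which is the computational simplicity the abstract contrasts with MINLP-based MPC.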