Title: Deep reinforcement learning for modeling human locomotion control in neuromechanical simulation
Abstract: Modeling human motor control and predicting how humans will move in novel environments is a grand scientific challenge. Researchers in the fields of biomechanics and motor control have proposed and evaluated motor control models via neuromechanical simulations, which produce physically correct motions of a musculoskeletal model. Typically, researchers have developed control models that encode physiologically plausible motor control hypotheses and compared the resulting simulation behaviors to measurable human motion data. While such plausible control models were able to simulate and explain many basic locomotion behaviors (e.g. walking, running, and climbing stairs), modeling higher layer controls (e.g. processing environment cues, planning long-term motion strategies, and coordinating basic motor skills to navigate in dynamic and complex environments) remains a challenge. Recent advances in deep reinforcement learning lay a foundation for modeling these complex control processes and controlling a diverse repertoire of human movement; however, reinforcement learning has been rarely applied in neuromechanical simulation to model human control. In this paper, we review the current state of neuromechanical simulations, along with the fundamentals of reinforcement learning, as it applies to human locomotion. We also present a scientific competition and accompanying software platform, which we have organized to accelerate the use of reinforcement learning in neuromechanical simulations. This “Learn to Move” competition was an official competition at the NeurIPS conference from 2017 to 2019 and attracted over 1300 teams from around the world. Top teams adapted state-of-the-art deep reinforcement learning techniques and produced motions, such as quick turning and walk-to-stand transitions, that have not been demonstrated before in neuromechanical simulations without utilizing reference motion data.
We close with a discussion of future opportunities at the intersection of human movement simulation and reinforcement learning and our plans to extend the Learn to Move competition to further facilitate interdisciplinary collaboration in modeling human motor control for biomechanics and rehabilitation research.
Journal Name:
Journal of NeuroEngineering and Rehabilitation
Sponsoring Org:
National Science Foundation
More Like this
  1. Baden, Tom (Ed.)
    Animals modulate sensory processing in concert with motor actions. Parallel copies of motor signals, called corollary discharge (CD), prepare the nervous system to process the mixture of externally and self-generated (reafferent) feedback that arises during locomotion. Commonly, CD in the peripheral nervous system cancels reafference to protect sensors and the central nervous system from being fatigued and overwhelmed by self-generated feedback. However, cancellation also limits the feedback that contributes to an animal’s awareness of its body position and motion within the environment, the sense of proprioception. We propose that, rather than cancellation, CD to the fish lateral line organ restructures reafference to maximize proprioceptive information content. Fishes’ undulatory body motions induce reafferent feedback that can encode the body’s instantaneous configuration with respect to fluid flows. We combined experimental and computational analyses of swimming biomechanics and hair cell physiology to develop a neuromechanical model of how fish can track peak body curvature, a key signature of axial undulatory locomotion. Without CD, this computation would be challenged by sensory adaptation, typified by decaying sensitivity and phase distortions with respect to an input stimulus. We find that CD interacts synergistically with sensor polarization to sharpen sensitivity along sensors’ preferred axes. The sharpening of sensitivity regulates spiking to a narrow interval coinciding with peak reafferent stimulation, which prevents adaptation and homogenizes the otherwise variable sensor output. Our integrative model reveals a vital role of CD for ensuring precise proprioceptive feedback during undulatory locomotion, which we term external proprioception.
  2. Sensory feedback during movement entails sensing a mix of externally- and self-generated stimuli (respectively, exafference and reafference). In many peripheral sensory systems, a parallel copy of the motor command, a corollary discharge, is thought to eliminate sensory feedback during behaviors. However, reafference has important roles in motor control, because it provides real-time feedback on the animal’s motions through the environment. In this case, the corollary discharge must be calibrated to enable feedback while avoiding negative consequences like sensor fatigue. The undulatory motions of fishes’ bodies generate induced flows that are sensed by the lateral line sensory organ, and prior work has shown these reafferent signals contribute to the regulation of swimming kinematics. Corollary discharge to the lateral line reduces the gain for reafference, but cannot eliminate it altogether. We develop a data-driven model integrating swimming biomechanics, hair cell physiology, and corollary discharge to understand how sensory modulation is calibrated during locomotion in larval zebrafish. In the absence of corollary discharge, lateral line afferent units exhibit the highly heterogeneous habituation rates characteristic of hair cell systems, typified by decaying sensitivity and phase distortions with respect to an input stimulus. Activation of the corollary discharge prevents habituation, reduces response heterogeneity, and regulates response phases in a narrow interval around the time of the peak stimulus. This suggests a synergistic interaction between the corollary discharge and the polarization of lateral line sensors, which sharpens sensitivity along their preferred axes. Our integrative model reveals a vital role of corollary discharge for ensuring precise feedback, including proprioception, during undulatory locomotion.
  3. Computer modeling, simulation and optimization are powerful tools that have seen increased use in biomechanics research. Dynamic optimizations can be categorized as either data-tracking or predictive problems. The data-tracking approach has been used extensively to address human movement problems of clinical relevance. The predictive approach also holds great promise, but has seen limited use in clinical applications. Enhanced software tools would facilitate the application of predictive musculoskeletal simulations to clinically-relevant research. The open-source software OpenSim provides tools for generating tracking simulations but not predictive simulations. However, OpenSim includes an extensive application programming interface that permits extending its capabilities with scripting languages such as MATLAB. In the work presented here, we combine the computational tools provided by MATLAB with the musculoskeletal modeling capabilities of OpenSim to create a framework for generating predictive simulations of musculoskeletal movement based on direct collocation optimal control techniques. In many cases, the direct collocation approach can be used to solve optimal control problems considerably faster than traditional shooting methods. Cyclical and discrete movement problems were solved using a simple 1 degree of freedom musculoskeletal model and a model of the human lower limb, respectively. The problems could be solved in reasonable amounts of time (several seconds to 1–2 hours) using the open-source IPOPT solver. The problems could also be solved using the fmincon solver that is included with MATLAB, but the computation times were excessively long for all but the smallest of problems. The performance advantage for IPOPT was derived primarily by exploiting sparsity in the constraints Jacobian.
    The framework presented here provides a powerful and flexible approach for generating optimal control simulations of musculoskeletal movement using OpenSim and MATLAB. This should allow researchers to more readily use predictive simulation as a tool to address clinical conditions that limit human mobility.
  4. Insects are highly capable walkers, but many questions remain regarding how the insect nervous system controls locomotion. One particular question is how information is communicated between the ‘lower level’ ventral nerve cord (VNC) and the ‘higher level’ head ganglia to facilitate control. In this work, we seek to explore this question by investigating how systems traditionally described as ‘positive feedback’ may initiate and maintain stepping in the VNC with limited information exchanged between lower and higher level centers. We focus on the ‘reflex reversal’ of the stick insect femur-tibia joint between a resistance reflex (RR) and an active reaction in response to joint flexion, as well as the activation of populations of descending dorsal median unpaired (desDUM) neurons from limb strain as our primary reflex loops. We present the development of a neuromechanical model of the stick insect (Carausius morosus) femur-tibia (FTi) and coxa-trochanter joint control networks ‘in-the-loop’ with a physical robotic limb. The control network generates motor commands for the robotic limb, whose motion and forces generate sensory feedback for the network. We based our network architecture on the anatomy of the non-spiking interneuron joint control network that controls the FTi joint, extrapolated network connectivity based on known muscle responses, and previously developed mechanisms to produce ‘sideways stepping’. Previous studies hypothesized that RR is enacted by selective inhibition of sensory afferents from the femoral chordotonal organ, but no study has tested this hypothesis with a model of an intact limb. We found that inhibiting the network’s flexion position and velocity afferents generated a reflex reversal in the robot limb’s FTi joint. We also explored the intact network’s ability to sustain steady locomotion on our test limb.
    Our results suggested that the reflex reversal and limb strain reinforcement mechanisms are both necessary but individually insufficient to produce and maintain rhythmic stepping in the limb, which can be initiated or halted by brief, transient descending signals. Removing portions of this feedback loop or creating a large enough disruption can halt stepping independent of the higher-level centers. We conclude by discussing why the nervous system might control motor output in this manner, as well as how to apply these findings to generalized nervous system understanding and improved robotic control.
  5. The primary goal of an assist-as-needed (AAN) controller is to maximize subjects' active participation during motor training tasks while allowing moderate tracking errors to encourage human learning of a target movement. Impedance control is typically employed by AAN controllers to create a compliant force-field around the desired motion trajectory. To accommodate different individuals with varying motor abilities, most of the existing AAN controllers require extensive manual tuning of the control parameters, resulting in a tedious and time-consuming process. In this paper, we propose a reinforcement learning AAN controller that can autonomously reshape the force-field in real-time based on subjects' training performances. The use of action-dependent heuristic dynamic programming enables a model-free implementation of the proposed controller. To experimentally validate the controller, a group of healthy individuals participated in a gait training session wherein they were asked to learn a modified gait pattern with the help of a powered ankle-foot orthosis. Results indicated the potential of the proposed control strategy for robot-assisted gait training.