Title: ALERA: Accelerated Reinforcement Learning Driven Adaptation to Electro-Mechanical Degradation in Nonlinear Control Systems Using Encoded State Space Error Signatures
The successful deployment of autonomous real-time systems is contingent on their ability to recover from performance degradation of sensors, actuators, and other electro-mechanical subsystems with low latency. In this article, we introduce ALERA, a novel framework for real-time control law adaptation in nonlinear control systems, assisted by system state encodings that generate an error signal when the code properties are violated in the presence of failures. The contributions of this methodology are twofold. First, we show that the time-domain error signal carries diagnostic information about the perturbed system parameters, which can be used for rapid adaptation of the control law to failure conditions. Second, this adaptation is performed by reinforcement learning algorithms that relearn the control law of the perturbed system from a starting condition dictated by the diagnostic information, thus achieving significantly faster recovery. The fast performance recovery enabled by ALERA, up to 80X faster than traditional reinforcement learning paradigms, is demonstrated on an inverted pendulum balancing problem, a brake-by-wire system, and a self-balancing robot.
Award ID(s): 1723997
PAR ID: 10175563
Author(s) / Creator(s):
Date Published:
Journal Name: ACM Transactions on Intelligent Systems and Technology
Volume: 10
Issue: 4
ISSN: 2157-6912
Page Range / eLocation ID: 1-25
Format(s): Medium: X
Sponsoring Org: National Science Foundation
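To make the warm-start mechanism concrete, below is a minimal sketch of the core idea on a toy one-dimensional balancing task. It is not the paper's implementation: ALERA derives the starting condition from encoded error signatures, whereas here the diagnosis step is simulated by simply reusing the pre-fault policy, and all plant details are illustrative.

```python
# Warm-started vs. cold-started tabular Q-learning after an actuator fault.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 51, 3
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

def step(s, a, gain=1.0):
    # Toy 1-D plant: the agent pushes the state toward the center bin;
    # gain < 1 models a degraded actuator that produces less torque.
    push = int(round(gain * (a - 1)))          # actions {0,1,2} -> {-1,0,+1}
    drift = int(rng.integers(-1, 2))
    s2 = int(np.clip(s + push + drift, 0, N_STATES - 1))
    return s2, -abs(s2 - N_STATES // 2)        # reward peaks at the center bin

def train(q, gain, episodes):
    for _ in range(episodes):
        s = int(rng.integers(N_STATES))
        for _ in range(100):
            a = int(rng.integers(N_ACTIONS)) if rng.random() < EPS else int(q[s].argmax())
            s2, r = step(s, a, gain)
            q[s, a] += ALPHA * (r + GAMMA * q[s2].max() - q[s, a])
            s = s2
    return q

q_nominal = train(np.zeros((N_STATES, N_ACTIONS)), gain=1.0, episodes=300)
# Fault occurs (gain drops to 0.5): cold restart vs. warm start from the
# pre-fault policy, standing in for ALERA's diagnosis-driven starting point.
q_cold = train(np.zeros((N_STATES, N_ACTIONS)), gain=0.5, episodes=30)
q_warm = train(q_nominal.copy(), gain=0.5, episodes=30)
```

The point of the comparison is that q_warm begins from a policy that is already close to correct for the degraded plant, so far fewer relearning episodes are needed than for the cold restart, which is the effect the abstract quantifies on its case studies.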
More Like this
  1. Obeid, Iyad Selesnick (Ed.)
    Scalp electroencephalogram (EEG) signals inherently have a low signal-to-noise ratio due to the way the signal is electrically transduced. Temporal and spatial information must be exploited to achieve accurate detection of seizure events. Most popular deep learning approaches to seizure detection either do not jointly model this information or require multiple passes over the signal, which makes the systems inherently non-causal. In this paper, we exploit both simultaneously by converting the multichannel signal to a grayscale image and using transfer learning to achieve high performance. The proposed system is trained end-to-end with only simple pre- and post-processing operations that are computationally lightweight and have low latency, making them conducive to clinical applications that require real-time processing. We achieved 42.05% sensitivity with 5.78 false alarms per 24 hours on the development dataset of v1.5.2 of the Temple University Hospital Seizure Detection Corpus. On a single-core CPU operating at 1.7 GHz, the system runs faster than real time (0.58 xRT), uses 16 GB of memory, and has a latency of 300 ms.
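As a rough illustration of the image-based formulation described above, the sketch below tiles a multichannel EEG window into a single-channel image and classifies it with a pretrained backbone. The window size, interpolation, and choice of ResNet-18 are assumptions for illustration, not the paper's pipeline.

```python
# Multichannel EEG window -> "grayscale image" -> pretrained CNN classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

class EEGImageClassifier(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.backbone = resnet18(weights="IMAGENET1K_V1")  # transfer learning
        # Swap the RGB stem for a single-channel (grayscale) input.
        self.backbone.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2,
                                        padding=3, bias=False)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, n_classes)

    def forward(self, eeg):                   # eeg: (batch, channels, samples)
        img = eeg.unsqueeze(1)                # treat the window as a 2-D image
        img = F.interpolate(img, size=(224, 224), mode="bilinear",
                            align_corners=False)
        return self.backbone(img)

model = EEGImageClassifier()
logits = model(torch.randn(4, 22, 2500))      # e.g., 22 channels, 10 s at 250 Hz
```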
  2. In feedback control of dynamical systems, a higher loop gain is typically desirable because it yields faster closed-loop dynamics, smaller tracking error, and more effective disturbance suppression. Yet an increased loop gain requires a higher control effort, which can exceed the actuation capacity of the feedback system and intermittently cause actuator saturation. To benefit from the advantages of a high feedback gain while avoiding actuator saturation, this paper advocates a dynamic gain adaptation technique in which the loop gain is lowered whenever necessary to prevent actuator saturation and is raised again whenever possible. This concept is optimized for linear systems through an optimal control formulation inspired by the linear quadratic regulator (LQR). The quadratic cost functional of LQR is modified into a quasi-quadratic form in which the control cost is dynamically emphasized or deemphasized as a function of the system state. The optimal control law resulting from this quasi-quadratic cost functional is essentially nonlinear, but its structure resembles an LQR with a gain adapted by the state of the system to prevent actuator saturation. Moreover, under mild assumptions analogous to those of LQR, this optimal control law is stabilizing. As an illustrative example, the control law is applied to feedback design for dc servomotors, and its performance is verified by numerical simulations.
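The heuristic core of the gain-adaptation idea can be sketched as follows. The paper derives the optimal nonlinear law from a quasi-quadratic cost, whereas this sketch only scales a nominal LQR gain; the double-integrator plant and the adaptation constants are illustrative assumptions.

```python
# Lower the loop gain when the LQR command would saturate; raise it otherwise.
import numpy as np
from scipy.linalg import solve_continuous_are

# Double integrator as an illustrative plant (not the paper's dc-servo model).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)               # nominal LQR gain

U_MAX, DT = 1.0, 0.01
x, gain = np.array([2.0, 0.0]), 1.0
for _ in range(2000):
    u_cmd = (-gain * K @ x).item()
    if abs(u_cmd) > U_MAX:
        gain *= U_MAX / abs(u_cmd)            # lower the gain when necessary...
    else:
        gain = min(1.0, 1.01 * gain)          # ...and raise it again when possible
    u = float(np.clip(u_cmd, -U_MAX, U_MAX))
    x = x + DT * (A @ x + B[:, 0] * u)        # Euler step of x' = Ax + Bu
```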
  3. A deep neural network (DNN)-based adaptive controller with a real-time, concurrent learning (CL)-based adaptive update law is developed for a class of uncertain, nonlinear dynamic systems. The DNN in the control law approximates the uncertain nonlinear dynamic model. The inner-layer weights of the DNN are trained offline (concurrent with real-time execution) once a sufficient amount of data has been collected in real time, and after training completes they are applied in batch updates; the output-layer weights are updated online (i.e., in real time) using a Lyapunov- and CL-based adaptation law. The key development in this work is that the output-layer update law is augmented with CL-based terms to ensure that the output-layer weight estimates converge to within a ball of their optimal values. A Lyapunov-based stability analysis establishes semi-global exponential convergence to an ultimate bound for the trajectory tracking errors and the output-layer DNN weight estimation errors.
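A heavily simplified sketch of the output-layer adaptation is shown below: the weight estimate is driven by the current error plus a concurrent-learning term that replays a recorded data stack, so adaptation continues even without persistent excitation. The gains, feature model, and error signal are illustrative stand-ins, not the paper's Lyapunov-derived law.

```python
# Concurrent-learning update of output-layer weights W (toy regression setting).
import numpy as np

rng = np.random.default_rng(1)
PHI_DIM, OUT_DIM = 8, 2
GAMMA_W, K_CL, DT = 0.5, 0.1, 0.01

W = np.zeros((PHI_DIM, OUT_DIM))                   # output-layer estimate
W_star = rng.standard_normal((PHI_DIM, OUT_DIM))   # unknown ideal weights
stack = []                                         # recorded (feature, target) pairs

def cl_update(W, phi, err):
    # Instantaneous term driven by the current error, plus a CL term
    # that replays the recorded data stack.
    dW = GAMMA_W * np.outer(phi, err)
    for phi_j, y_j in stack:
        dW += K_CL * np.outer(phi_j, y_j - W.T @ phi_j)
    return W + DT * dW

for t in range(2000):
    phi = rng.standard_normal(PHI_DIM)        # inner-layer features (held fixed here)
    y = W_star.T @ phi
    err = y - W.T @ phi                       # stand-in for the tracking error signal
    if len(stack) < 20:
        stack.append((phi, y))                # fill the memory once, then replay it
    W = cl_update(W, phi, err)

print(np.linalg.norm(W - W_star))             # estimation error shrinks over time
```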
  4. We introduce L1-MBRL, a control-theoretic augmentation scheme for Model-Based Reinforcement Learning (MBRL) algorithms. Unlike model-free approaches, MBRL algorithms learn a model of the transition function from data and use it to design a control input. Our approach generates a series of approximate control-affine models of the learned transition function according to a proposed switching law. Using the approximate model, the control input produced by the underlying MBRL algorithm is perturbed by an L1 adaptive controller, which is designed to enhance the robustness of the system against uncertainties. Importantly, the approach is agnostic to the choice of MBRL algorithm, so the scheme can be used with a variety of MBRL algorithms. MBRL algorithms with L1 augmentation exhibit enhanced performance and sample efficiency across multiple MuJoCo environments, outperforming the original MBRL algorithms both with and without system noise.
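The structure of the L1 augmentation (state predictor, fast adaptation, low-pass-filtered cancellation) can be sketched on a scalar control-affine plant. The gradient-type adaptation law and all constants below are illustrative assumptions rather than the paper's implementation.

```python
# L1 augmentation of a baseline RL action on a scalar plant x' = f(x) + g(x)(u + sigma).
DT, A_S, OMEGA, GAMMA_AD = 0.001, -10.0, 5.0, 100.0

def f(x): return -x                           # learned drift term (stand-in)
def g(x): return 1.0                          # learned input gain (stand-in)

x, x_hat, sigma_hat, u_l1 = 1.0, 1.0, 0.0, 0.0
for _ in range(5000):
    u_rl = -0.5 * x                           # action from the underlying MBRL policy
    u = u_rl + u_l1                           # perturb it with the L1 input
    x += DT * (f(x) + g(x) * (u + 0.8))       # true plant, unknown disturbance 0.8
    # State predictor, fast adaptation, and low-pass-filtered cancellation.
    x_tilde = x_hat - x
    x_hat += DT * (f(x_hat) + g(x_hat) * (u + sigma_hat) + A_S * x_tilde)
    sigma_hat += DT * (-GAMMA_AD * x_tilde * g(x_hat))
    u_l1 += DT * OMEGA * (-sigma_hat - u_l1)

print(x)  # settles near zero once the disturbance estimate is cancelled
```

The low-pass filter on the cancellation signal is what distinguishes L1 from plain adaptive inversion: adaptation can be fast while the control channel only passes the frequency content the actuator can track.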
  5. Numerous solutions have been proposed for Traffic Signal Control (TSC) tasks, aiming to provide efficient transportation and alleviate traffic congestion. Recently, Reinforcement Learning (RL) methods trained by trial and error in simulators have attained promising results, bringing confidence that cities' congestion problems can be solved. However, performance gaps remain when simulator-trained policies are deployed in the real world, mainly because of the difference in system dynamics between training simulators and real-world environments. In this work, we leverage the knowledge of Large Language Models (LLMs) to understand and profile the system dynamics through a prompt-based grounded action transformation that bridges the performance gap. Specifically, the paper exploits a pre-trained LLM's inference ability to understand how traffic dynamics change with weather conditions, traffic states, and road types. Aware of these changes, the policy's actions are grounded in realistic dynamics, helping the agent learn a more realistic policy. Experiments on four different scenarios show the effectiveness of the proposed PromptGAT in mitigating the performance gap of reinforcement learning from simulation to reality (sim-to-real).
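A minimal sketch of the prompt-based grounding step is given below. The query_llm helper is a hypothetical stand-in for any LLM client, and the prompt format and grounding rule are assumptions inferred from the abstract, not the PromptGAT implementation.

```python
# Ground a simulator-trained signal timing using an LLM's dynamics profile.
def query_llm(prompt: str) -> str:
    """Placeholder: route this to an actual LLM client in a real system."""
    return "0.7"   # e.g., the model predicts a reduced saturation flow rate

def grounded_green_time(sim_green_time: float, weather: str, road_type: str,
                        queue_len: int) -> float:
    prompt = (
        "Traffic signal control.\n"
        f"Weather: {weather}. Road type: {road_type}. Queue: {queue_len} vehicles.\n"
        "Relative to clear weather, by what factor does the saturation flow "
        "rate change? Answer with a single number."
    )
    factor = float(query_llm(prompt))
    # Ground the simulator-trained action: a longer green phase compensates
    # for the slower discharge rate anticipated under these conditions.
    return sim_green_time / max(factor, 0.1)

print(grounded_green_time(30.0, weather="heavy rain", road_type="arterial",
                          queue_len=12))
```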