Title: Adaptive Dynamic Programming for Decentralized Stabilization of Uncertain Nonlinear Large-Scale Systems With Mismatched Interconnections
This paper presents a novel decentralized control strategy for a class of uncertain nonlinear large-scale systems with mismatched interconnections. First, it is shown that the decentralized controller for the overall system can be represented by an array of optimal control policies of auxiliary subsystems. Then, within the framework of adaptive dynamic programming, a simultaneous policy iteration (SPI) algorithm is developed to solve the Hamilton–Jacobi–Bellman equations associated with auxiliary subsystem optimal control policies. The convergence of the SPI algorithm is guaranteed by an equivalence relationship. To implement the present SPI algorithm, actor and critic neural networks are applied to approximate the optimal control policies and the optimal value functions, respectively. Meanwhile, both the least squares method and the Monte Carlo integration technique are employed to derive the unknown weight parameters. Furthermore, by using Lyapunov’s direct method, the overall system with the obtained decentralized controller is proved to be asymptotically stable. Finally, the effectiveness of the proposed decentralized control scheme is illustrated via simulations for nonlinear plants and unstable power systems.
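As a concrete illustration of the simultaneous policy iteration step described in the abstract, the following Python sketch performs one critic fit and one actor fit by least squares over Monte Carlo sample states for a single auxiliary subsystem. It is a minimal sketch, not the paper's implementation: the dynamics f and g, the stage-cost weights Q and R, and the polynomial feature maps phi and sigma are placeholder assumptions.

    import numpy as np

    # One simultaneous-policy-iteration-style sweep for a single auxiliary
    # subsystem  x_dot = f(x) + g(x) u  with stage cost x'Qx + u'Ru.
    # Critic: V(x) ~ wc' phi(x).  Actor: u(x) ~ Wa' sigma(x).
    def phi(x):                    # critic features (placeholder)
        return np.array([x[0]**2, x[0]*x[1], x[1]**2])

    def grad_phi(x):               # Jacobian of the critic features
        return np.array([[2*x[0], 0.0],
                         [x[1],   x[0]],
                         [0.0,    2*x[1]]])

    def sigma(x):                  # actor features (placeholder)
        return np.array([x[0], x[1]])

    def f(x):                      # drift dynamics (placeholder)
        return np.array([-x[0] + x[1], -0.5*(x[0] + x[1])])

    def g(x):                      # input dynamics (placeholder)
        return np.array([[0.0], [1.0]])

    Q, R = np.eye(2), np.array([[1.0]])
    rng = np.random.default_rng(0)
    samples = rng.uniform(-1.0, 1.0, size=(200, 2))   # Monte Carlo sample states
    Wa = np.zeros((2, 1))                             # initial actor weights

    def policy(x, Wa):
        return Wa.T @ sigma(x)

    # Policy evaluation: least-squares critic weights from the residual
    # grad(V)'(f + g u) + x'Qx + u'Ru = 0 at each sampled state.
    A = np.vstack([grad_phi(x) @ (f(x) + g(x) @ policy(x, Wa)) for x in samples])
    b = -np.array([x @ Q @ x + float(policy(x, Wa).T @ R @ policy(x, Wa)) for x in samples])
    wc, *_ = np.linalg.lstsq(A, b, rcond=None)

    # Policy improvement: fit the actor to u = -(1/2) R^{-1} g' grad_phi' wc.
    targets = np.vstack([-0.5 * np.linalg.inv(R) @ g(x).T @ grad_phi(x).T @ wc for x in samples])
    Phi_a = np.vstack([sigma(x) for x in samples])
    Wa, *_ = np.linalg.lstsq(Phi_a, targets, rcond=None)

Iterating these two fits until the critic weights converge mirrors the SPI loop; per the abstract, each auxiliary subsystem runs such a procedure and the resulting policies together form the decentralized controller.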
Award ID(s):
1731672
NSF-PAR ID:
10065579
Author(s) / Creator(s):
Date Published:
Journal Name:
IEEE Transactions on Systems, Man, and Cybernetics: Systems
ISSN:
2168-2216
Page Range / eLocation ID:
1 to 13
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider the decentralized control of radial distribution systems with controllable photovoltaic inverters and energy storage resources. For such systems, we investigate the problem of designing fully decentralized controllers that minimize the expected cost of balancing demand, while guaranteeing the satisfaction of individual resource and distribution system voltage constraints. Employing a linear approximation of the branch flow model, we formulate this problem as the design of a decentralized disturbance-feedback controller that minimizes the expected value of a convex quadratic cost function, subject to robust convex quadratic constraints on the system state and input. As such problems are, in general, computationally intractable, we derive a tractable inner approximation to this decentralized control problem, which enables the efficient computation of an affine control policy via the solution of a finite-dimensional conic program. As affine policies are, in general, suboptimal for the family of systems considered, we provide an efficient method to bound their suboptimality via the optimal solution of another finite-dimensional conic program. A case study of a 12 kV radial distribution system demonstrates that decentralized affine controllers can perform close to optimal. 
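    The decentralized, causal structure of the affine disturbance-feedback policies described above can be made concrete with a short sketch. The Python snippet below only evaluates a given policy of the form u_i(t) = k_i(t) + sum_{s<=t} K_i(t,s) w_i(s), in which resource i reacts to its local disturbance history alone; the horizon, gains, and disturbances are placeholder assumptions, and computing the gains themselves would require the conic program described in the abstract.

        import numpy as np

        # Evaluate a decentralized affine disturbance-feedback policy: each
        # resource's input depends affinely on its own disturbance history only.
        T, n_res = 4, 3                                # horizon, number of resources
        rng = np.random.default_rng(1)
        K = rng.normal(scale=0.1, size=(n_res, T, T))  # local feedback gains K_i(t, s)
        k = rng.normal(scale=0.1, size=(n_res, T))     # affine offsets k_i(t)
        w = rng.normal(size=(n_res, T))                # realized local disturbances

        u = np.zeros((n_res, T))
        for i in range(n_res):
            for t in range(T):
                # causality: only disturbances observed up to time t enter u_i(t)
                u[i, t] = k[i, t] + K[i, t, :t+1] @ w[i, :t+1]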
  2. We consider the decentralized control of radial distribution systems with controllable photovoltaic inverters and storage devices. For such systems, we consider the problem of designing controllers that minimize the expected cost of meeting demand, while respecting distribution system and resource constraints. Employing a linear approximation of the branch flow model, we formulate this problem as the design of a decentralized disturbance-feedback controller that minimizes the expected value of a convex quadratic cost function, subject to convex quadratic constraints on the state and input. As such problems are, in general, computationally intractable, we derive an inner approximation to this decentralized control problem, which enables the efficient computation of an affine control policy via the solution of a conic program. As affine policies are, in general, suboptimal for the systems considered, we provide an efficient method to bound their suboptimality via the solution of another conic program. A case study of a 12 kV radial distribution feeder demonstrates that decentralized affine controllers can perform close to optimal. 
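    The suboptimality-bounding idea can be illustrated with a generic perfect-information (clairvoyant) relaxation instead of the paper's conic-program bound: for each sampled disturbance trajectory, the cost achievable with full foresight lower-bounds the cost of any causal policy, so the average gap to a fixed affine policy upper-bounds that policy's suboptimality. The CVXPY sketch below uses a toy single-resource problem with an input bound; all problem data are placeholder assumptions.

        import numpy as np
        import cvxpy as cp

        T, rho, u_max = 4, 0.1, 1.0
        rng = np.random.default_rng(2)
        K = 0.2 * np.tril(np.ones((T, T)))      # causal affine gains (placeholder)

        def affine_cost(w):
            u = K @ w                           # u_t depends on w_0..w_t only
            return float(np.sum((w - u) ** 2 + rho * u ** 2)), u

        def clairvoyant_cost(w):
            u = cp.Variable(T)                  # full-foresight (non-causal) decision
            cost = cp.sum_squares(u - w) + rho * cp.sum_squares(u)
            cp.Problem(cp.Minimize(cost), [cp.abs(u) <= u_max]).solve()
            return cost.value

        gaps = []
        for _ in range(50):                     # Monte Carlo scenarios
            w = rng.uniform(-1.0, 1.0, size=T)
            j_aff, u = affine_cost(w)
            assert np.all(np.abs(u) <= u_max)   # these toy gains respect the bound
            gaps.append(j_aff - clairvoyant_cost(w))

        print("estimated suboptimality bound:", np.mean(gaps))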
  3. Deep reinforcement learning approaches are becoming appealing for the design of nonlinear controllers for voltage control problems, but the lack of stability guarantees hinders their real-world deployment. This letter constructs a decentralized RL-based controller for inverter-based real-time voltage control in distribution systems. It features two components: a transient control policy and a steady-state performance optimizer. The transient policy is parameterized as a neural network, and the steady-state optimizer represents the gradient of the long-term operating cost function. The two parts are synthesized through a safe gradient flow framework, which prevents the violation of reactive power capacity constraints. We prove that if the output of the transient controller is bounded and monotonically decreasing with respect to its input, then the closed-loop system is asymptotically stable and converges to the optimal steady-state solution. We demonstrate the effectiveness of our method by conducting experiments with IEEE 13-bus and 123-bus distribution system test feeders. 
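    A rough sketch of the two-component controller structure described above is given below, assuming a scaled negative tanh as the bounded, monotonically decreasing transient policy, a quadratic operating cost whose gradient plays the role of the steady-state optimizer, and a simple projection standing in for the safe-gradient-flow handling of reactive power capacity constraints; it is not the letter's exact construction.

        import numpy as np

        n = 3                                   # controllable inverters
        a = np.array([0.5, 0.8, 1.0])           # operating-cost curvatures (placeholder)
        q_cap = np.array([0.6, 0.5, 0.8])       # reactive power capacities (placeholder)
        v_ref, eta, gain = 1.0, 0.05, 2.0

        def transient_policy(dv):
            # bounded and monotonically decreasing in its input dv = v - v_ref
            return -np.tanh(gain * dv)

        def step(q, v):
            grad_cost = a * q                   # gradient of c_i(q) = 0.5 * a_i * q_i^2
            q_next = q + eta * (transient_policy(v - v_ref) - grad_cost)
            return np.clip(q_next, -q_cap, q_cap)   # respect capacity limits

        q = np.zeros(n)                         # reactive power setpoints
        v = np.array([1.03, 0.97, 1.01])        # local voltage measurements (placeholder)
        q = step(q, v)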
  4. Shared control of mobile robots integrates manual input with auxiliary autonomous controllers to improve the overall system performance. However, prior work that seeks to find the optimal shared control ratio needs an accurate human model, which is usually challenging to obtain. In this study, the authors develop an extended Twin Delayed Deep Deterministic Policy Gradient (TD3X)-based shared control framework that learns to assist a human operator in teleoperating mobile robots optimally. The robot's states, the shared control ratio in the previous time step, and the human's control input are used as inputs to the reinforcement learning (RL) agent, which then outputs the optimal shared control ratio between the human input and the autonomous controllers without knowing the human model. Noisy softmax policies are developed to make the TD3X algorithm feasible under the constraint of a shared control ratio. Furthermore, to accelerate the training process and protect the robot, a navigation demonstration policy and a safety guard are developed. A neural network (NN) structure is developed to maintain the correlation of sensor readings among heterogeneous input data and improve the learning speed. In addition, an extended DAGGER (DAGGERX) human agent is developed for training the RL agent to reduce human workload. Robot simulations and experiments with humans in the loop are conducted. The results show that the DAGGERX human agent can simulate real human inputs in the worst-case scenarios with a mean square error of 0.0039. Compared to the original TD3 agent, the TD3X-based shared control system decreased the average collision number from 387.3 to 44.4 in a simple environment and from 394.2 to 171.2 in a more complex environment. The maximum average return increased from 1043 to 1187 with faster convergence in the simple environment, while the performance was equally good in the complex environment because of the use of an advanced human agent. In the human subject tests, participants' average perceived workload was significantly lower under shared control than under exclusively manual control (26.90 vs. 40.07, p = 0.013).
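    The ratio-based blending at the core of such a shared-control framework can be sketched as follows; the rl_policy and autonomous_controller functions below are placeholders standing in for the trained TD3X actor and the navigation controller, and only the observation construction and the mixing of the two commands are meant to be illustrative.

        import numpy as np

        def autonomous_controller(state):
            # placeholder navigation command (e.g., linear and angular velocity)
            return np.array([0.5, 0.0])

        def rl_policy(observation):
            # placeholder for the trained actor; a real TD3X actor would be a
            # neural network mapping the observation to the sharing ratio
            return 0.7

        def shared_control_step(state, human_cmd, prev_alpha):
            # observation = robot state, previous sharing ratio, human command
            observation = np.concatenate([state, [prev_alpha], human_cmd])
            alpha = float(np.clip(rl_policy(observation), 0.0, 1.0))
            blended = alpha * human_cmd + (1.0 - alpha) * autonomous_controller(state)
            return blended, alpha

        state = np.array([0.0, 0.0, 0.1])       # e.g., pose / sensor summary (placeholder)
        human_cmd = np.array([0.8, 0.2])        # operator command (placeholder)
        cmd, alpha = shared_control_step(state, human_cmd, prev_alpha=0.5)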
  5. In this paper, we study the event-triggered robust stabilization problem of nonlinear systems subject to mismatched perturbations and input constraints. First, with the introduction of an infinite-horizon cost function for the auxiliary system, we transform the robust stabilization problem into a constrained optimal control problem. Then, we prove that the solution of the event-triggered Hamilton–Jacobi–Bellman (ETHJB) equation, which arises in the constrained optimal control problem, guarantees that the original system states are uniformly ultimately bounded (UUB). To solve the ETHJB equation, we present a single-network adaptive critic design (SN-ACD). The critic network used in the SN-ACD is tuned through the gradient descent method. Using the Lyapunov method, we demonstrate that all the signals in the closed-loop auxiliary system are UUB. Finally, we provide two examples, including the pendulum system, to validate the proposed event-triggered control strategy.
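    Two ingredients mentioned above, the event-trigger test and the gradient-descent tuning of the single critic network, can be sketched as follows. The feature gradient, dynamics, learning rate, and triggering threshold are placeholder assumptions, the regressor is treated as fixed in the gradient step (as is standard in critic tuning), and the input-constraint handling used in the paper is omitted.

        import numpy as np

        def phi_grad(x):                        # gradient of critic features (placeholder)
            return np.array([[2 * x[0], 0.0],
                             [0.0, 2 * x[1]]])

        def f(x):                               # drift dynamics (placeholder)
            return np.array([-x[0], -0.5 * x[1]])

        def g(x):                               # input dynamics (placeholder)
            return np.array([[0.0], [1.0]])

        Q, R, lr = np.eye(2), np.array([[1.0]]), 0.05

        def control(x, wc):
            return -0.5 * np.linalg.inv(R) @ g(x).T @ phi_grad(x).T @ wc

        def critic_step(wc, x_hat):
            u = control(x_hat, wc)
            regressor = phi_grad(x_hat) @ (f(x_hat) + g(x_hat) @ u)
            e = x_hat @ Q @ x_hat + float(u.T @ R @ u) + wc @ regressor   # Hamiltonian residual
            return wc - lr * e * regressor      # gradient step on e^2 / 2

        def event_triggered(x, x_hat, threshold=0.05):
            # refresh the control only when the gap to the last sampled state is large
            return float(np.linalg.norm(x - x_hat) ** 2) > threshold

        wc = np.ones(2)                         # critic weights
        x_hat = np.array([0.5, -0.3])           # state at the last triggering instant
        wc = critic_step(wc, x_hat)
        x = np.array([0.42, -0.18])             # current state
        if event_triggered(x, x_hat):
            x_hat = x                           # new triggering instant; recompute control(x_hat, wc)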