Deep reinforcement learning (RL) approaches are increasingly appealing for the design of nonlinear controllers for voltage control problems, but the lack of stability guarantees hinders their real-world deployment. This letter constructs a decentralized RL-based controller for inverter-based real-time voltage control in distribution systems. It features two components: a transient control policy and a steady-state performance optimizer. The transient policy is parameterized as a neural network, and the steady-state optimizer represents the gradient of the long-term operating cost function. The two parts are synthesized through a safe gradient flow framework, which prevents the violation of reactive power capacity constraints. We prove that if the output of the transient controller is bounded and monotonically decreasing with respect to its input, then the closed-loop system is asymptotically stable and converges to the optimal steady-state solution. We demonstrate the effectiveness of our method in experiments on the IEEE 13-bus and 123-bus distribution test feeders.
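To make the architecture described above concrete, here is a minimal numerical sketch, assuming a single bus with a scalar voltage measurement. The bounded, monotonically decreasing transient policy is stood in for by a scaled tanh (in place of the trained neural network), `grad_cost` is a hypothetical gradient of the steady-state operating cost, and a simple projection onto the reactive power limits stands in for the full safe gradient flow; all names and constants are illustrative, not taken from the letter.

```python
import numpy as np

# Hypothetical problem data (placeholders, not from the letter).
V_REF = 1.0                # per-unit voltage reference
Q_MIN, Q_MAX = -0.5, 0.5   # reactive power capacity limits (p.u.)
ALPHA, ETA = 0.1, 0.05     # transient / steady-state step sizes

def transient_policy(dv):
    """Bounded, monotonically decreasing map of the voltage error.
    Stands in for the trained neural-network transient policy."""
    return -0.4 * np.tanh(5.0 * dv)

def grad_cost(q):
    """Hypothetical gradient of the long-term operating cost in q."""
    return 2.0 * q  # e.g., a quadratic reactive-power cost

def control_step(q, v):
    """One decentralized update; the clip projection stands in for the safe gradient flow."""
    dq = ALPHA * transient_policy(v - V_REF) - ETA * grad_cost(q)
    return np.clip(q + dq, Q_MIN, Q_MAX)

# Example: drive an over-voltage bus back toward the reference.
q, v = 0.0, 1.03
for _ in range(50):
    q = control_step(q, v)
    v = 1.03 + 0.8 * q  # crude linearized voltage response (illustrative only)
print(q, v)
```

Even with these placeholders, the update stays bounded and monotone in the voltage error, which is the property the letter's stability argument relies on.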
Deep Policy Gradient for Reactive Power Control in Distribution Systems
Pronounced variability due to the growth of renewable energy sources, flexible loads, and distributed generation is challenging residential distribution systems. This context motivates fast, efficient, and robust reactive power control. Optimal reactive power control is possible in theory by solving a non-convex optimization problem based on the exact distribution power flow model. However, the lack of high-precision instrumentation and reliable communications, as well as the heavy computational burden of non-convex optimization solvers, renders computing and implementing the optimal control challenging in practice. Taking a statistical learning viewpoint, the input-output relationship between each grid state and the corresponding optimal reactive power control (a.k.a. the policy) is parameterized in the present work by a deep neural network, whose unknown weights are updated by minimizing the accumulated power loss over a number of historical and simulated training pairs, using the policy gradient method. In the inference phase, one simply feeds the real-time state vector into the learned neural network to obtain the ‘optimal’ reactive power control decision with only several matrix-vector multiplications. The merits of this novel deep policy gradient approach include its computational efficiency as well as robustness to random input perturbations. Numerical tests on a 47-bus distribution network using real solar and consumption data corroborate these practical merits.
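As a rough illustration of the training and inference phases described above, the following sketch assumes a Gaussian policy whose mean is a small feed-forward network mapping the grid state to reactive power setpoints; `power_loss` is a hypothetical stand-in for the accumulated network loss returned by a power-flow simulator or a historical record, so the weights are updated with a REINFORCE-style policy gradient rather than by differentiating through a grid model. Dimensions, learning rates, and function names are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn

STATE_DIM, CTRL_DIM = 10, 4  # illustrative dimensions

policy_mean = nn.Sequential(  # deep policy: grid state -> reactive power setpoints
    nn.Linear(STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, CTRL_DIM),
)
log_std = torch.zeros(CTRL_DIM, requires_grad=True)
opt = torch.optim.Adam(list(policy_mean.parameters()) + [log_std], lr=1e-3)

def power_loss(state, q_ctrl):
    """Hypothetical stand-in for the accumulated power loss incurred by
    control q_ctrl in `state` (simulator output or historical record)."""
    return ((q_ctrl - 0.1 * state[..., :CTRL_DIM]) ** 2).sum(dim=-1)

for epoch in range(200):                    # offline training phase
    states = torch.randn(32, STATE_DIM)     # historical / simulated grid states
    dist = torch.distributions.Normal(policy_mean(states), log_std.exp())
    q = dist.sample()                       # sampled reactive power decisions
    loss = power_loss(states, q)            # cost per sample
    # Policy gradient: lower the log-probability of decisions with high loss.
    pg_objective = (dist.log_prob(q).sum(dim=-1) * loss.detach()).mean()
    opt.zero_grad()
    pg_objective.backward()
    opt.step()

# Inference: a few matrix-vector products give the control decision.
with torch.no_grad():
    q_star = policy_mean(torch.randn(STATE_DIM))
```

At inference time only the mean network is evaluated, so the control decision indeed reduces to a handful of matrix-vector multiplications, as claimed in the abstract.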
- Award ID(s):
- 1901134
- PAR ID:
- 10273949
- Date Published:
- Journal Name:
- Proceedings of the IEEE SmartGridComm Conference
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- Data-enabled predictive control (DeePC) is a data-driven control algorithm that utilizes data matrices to form a non-parametric representation of the underlying system, predicting future behaviors and generating optimal control actions. DeePC typically requires solving an online optimization problem whose complexity is heavily influenced by the amount of data used, potentially leading to expensive online computation. In this article, we leverage deep learning to propose a highly computationally efficient DeePC approach for general nonlinear processes, referred to as Deep DeePC. Specifically, a deep neural network is employed to learn the DeePC vector operator, which is an essential component of the non-parametric representation of DeePC. This neural network is trained offline using historical open-loop input and output data of the nonlinear process. With the trained neural network, the Deep DeePC framework is formed for online control implementation. At each sampling instant, the neural network directly outputs the DeePC operator, eliminating the need for the online optimization required by conventional DeePC. The optimal control action is obtained from the DeePC operator updated by the trained neural network. To address constrained scenarios, a constraint-handling scheme is further proposed and integrated with Deep DeePC to handle hard constraints during online implementation. The efficacy and superiority of the proposed Deep DeePC approach are demonstrated using two benchmark process examples. (A schematic sketch of the online step, under stated assumptions, is given after this list.)
- Ozay, Necmiye; Balzano, Laura; Panagou, Dimitra; Abate, Alessandro (Ed.) Many optimal and robust control problems are nonconvex and potentially nonsmooth in their policy optimization forms. In this paper, we introduce the Extended Convex Lifting (ECL) framework, which reveals hidden convexity in classical optimal and robust control problems from a modern optimization perspective. Our ECL framework offers a bridge between nonconvex policy optimization and convex reformulations. Despite non-convexity and non-smoothness, the existence of an ECL for policy optimization not only reveals that the policy optimization problem is equivalent to a convex problem, but also certifies a class of first-order non-degenerate stationary points to be globally optimal. We further show that this ECL framework encompasses many benchmark control problems, including LQR, state-feedback and output-feedback H-infinity robust control. We believe that ECL will also be of independent interest for analyzing nonconvex problems beyond control.
- Distributed feedback design and complexity constrained control are examples of problems posed within the domain of structured optimal feedback synthesis. The optimal feedback gain is typically a non-convex function of system primitives. However, in recent years, algorithms have been proposed to obtain locally optimal solutions. In applications to large-scale distributed control, the major obstacle is computational complexity. This paper addresses complexity through a combination of linear-algebraic techniques and computational methods adapted from both machine learning and reinforcement learning. It is shown that for general classes of optimal control problems, the objective function and its gradient can be computed from data. Transformations borrowed from the theory of reinforcement learning are adapted to obtain simulation-based algorithms for computing the structured optimal H2 feedback gain. Customized proximal algorithms based on gradient descent and incremental gradient are tested in computational experiments and their relative merits are discussed.
- Nonlinear optimal control problems are challenging to solve efficiently due to non-convexity. This paper introduces a trajectory optimization approach that achieves real-time performance by combining machine learning to predict optimal trajectories with refinement by quadratic optimization. First, a library of optimal trajectories is calculated offline and used to train a neural network. Online, the neural network predicts a trajectory for a novel initial state and cost function, and this prediction is further optimized by a sparse quadratic programming solver. We apply this approach to a fly-to-target movement problem for an indoor quadrotor. Experiments demonstrate that the technique calculates near-optimal trajectories in a few milliseconds, and generates agile movement that can be tracked more accurately than existing methods.
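For the trajectory-optimization entry directly above, a compact sketch of the two-stage idea follows, under stated assumptions: trajectories are flat vectors of waypoints, the offline `optimal_trajectory` oracle and all dimensions are illustrative placeholders, and the online refinement is an unconstrained quadratic program solved in closed form rather than a sparse QP solver.

```python
import numpy as np
import torch
import torch.nn as nn

WAYPOINTS, DIM = 20, 3   # illustrative trajectory discretization
STATE_DIM = 6            # assumed encoding of initial state and goal

def optimal_trajectory(s):
    """Hypothetical offline oracle: returns an 'optimal' flat trajectory for state s."""
    goal = s[3:6]
    return np.outer(np.linspace(0.0, 1.0, WAYPOINTS), goal).ravel()

# 1) Offline: build a library of optimal trajectories and fit a predictor network.
states = np.random.randn(500, STATE_DIM).astype(np.float32)
trajs = np.stack([optimal_trajectory(s) for s in states]).astype(np.float32)
net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                    nn.Linear(128, WAYPOINTS * DIM))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
X, Y = torch.from_numpy(states), torch.from_numpy(trajs)
for _ in range(300):
    opt.zero_grad()
    loss = ((net(X) - Y) ** 2).mean()
    loss.backward()
    opt.step()

# 2) Online: predict a trajectory, then refine it with a small quadratic program.
def plan(s, lam=1e-2):
    x_pred = net(torch.from_numpy(s.astype(np.float32))).detach().numpy()
    n = WAYPOINTS * DIM
    D = np.eye(n) - np.eye(n, k=DIM)  # finite-difference (smoothness) operator
    H = D.T @ D + lam * np.eye(n)     # positive definite => closed-form minimizer
    # minimize ||D x||^2 + lam ||x - x_pred||^2
    return np.linalg.solve(H, lam * x_pred)

print(plan(np.random.randn(STATE_DIM)).shape)
```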
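The Deep DeePC entry earlier in this list lends itself to a similar sketch. The fragment below builds Hankel data matrices from an offline input-output record and shows the online step in which a trained network, here replaced by a least-squares stub so the example runs, emits the DeePC operator g directly; the input sequence is then read off by a matrix-vector product. The clipping stands in for the article's constraint-handling scheme, and all dimensions and names are assumptions rather than the authors' implementation.

```python
import numpy as np

# Illustrative SISO setup: horizon lengths and data records are placeholders.
T_INI, N_PRED = 4, 10
u_data = np.random.randn(200)  # offline open-loop input record
y_data = np.random.randn(200)  # offline output record

def hankel(w, depth):
    """Stack sliding windows of a signal into a Hankel data matrix."""
    cols = len(w) - depth + 1
    return np.stack([w[i:i + depth] for i in range(cols)], axis=1)

L = T_INI + N_PRED
Hu, Hy = hankel(u_data, L), hankel(y_data, L)
U_p, U_f = Hu[:T_INI], Hu[T_INI:]  # past / future input blocks
Y_p, Y_f = Hy[:T_INI], Hy[T_INI:]  # past / future output blocks

def deep_deepc_step(g_net, u_ini, y_ini, ref):
    """Online step: the trained network replaces the DeePC optimization and
    directly emits the operator g; the input sequence is read off from U_f."""
    g = g_net(np.concatenate([u_ini, y_ini, ref]))
    u_seq = U_f @ g                            # predicted input sequence
    y_seq = Y_f @ g                            # implied output prediction
    return np.clip(u_seq[0], -1.0, 1.0), y_seq  # crude hard-constraint clip

def g_net_stub(z):
    """Placeholder for the offline-trained network: a least-squares solve of the
    DeePC data equations, used only so the example runs end to end."""
    return np.linalg.lstsq(np.vstack([U_p, Y_p, Y_f]), z, rcond=None)[0]

u0, y_pred = deep_deepc_step(g_net_stub,
                             u_ini=u_data[:T_INI], y_ini=y_data[:T_INI],
                             ref=np.zeros(N_PRED))
```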