Deep reinforcement learning approaches are becoming appealing for the design of nonlinear controllers for voltage control problems, but the lack of stability guarantees hinders their real-world deployment. This letter constructs a decentralized RL-based controller for inverter-based real-time voltage control in distribution systems. It features two components: a transient control policy and a steady-state performance optimizer. The transient policy is parameterized as a neural network, and the steady-state optimizer represents the gradient of the long-term operating cost function. The two parts are synthesized through a safe gradient flow framework, which prevents the violation of reactive power capacity constraints. We prove that if the output of the transient controller is bounded and monotonically decreasing with respect to its input, then the closed-loop system is asymptotically stable and converges to the optimal steady-state solution. We demonstrate the effectiveness of our method by conducting experiments with IEEE 13-bus and 123-bus distribution system test feeders.
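The stability condition stated above (a transient policy whose output is bounded and monotonically decreasing in its input) can be illustrated with a toy single-bus sketch. Everything here is an assumption made for illustration: the linear voltage model `v = v_base + x * q`, the `tanh` policy standing in for a trained neural network, the quadratic operating cost, and the plain clipping step, which is a crude stand-in for the letter's safe gradient flow rather than an implementation of it.

```python
import math


def transient_policy(v_err, gain=1.0):
    # Hypothetical stand-in for the neural policy: tanh is bounded, and the
    # minus sign makes the output monotonically decreasing in v_err.
    return -gain * math.tanh(v_err)


def simulate(q_max, steps=500, dt=0.1, v_base=0.9, v_ref=1.0, x=0.5, cost=0.1):
    """Euler-integrate reactive power q for one inverter (toy model)."""
    q = 0.0
    history = []
    for _ in range(steps):
        v = v_base + x * q                             # assumed linear voltage model
        dq = transient_policy(v - v_ref) - cost * q    # policy term minus cost gradient
        # Clipping keeps q inside the capacity limits [-q_max, q_max].
        q = max(-q_max, min(q_max, q + dt * dq))
        history.append(q)
    return history
```

With a loose capacity limit, the iterate settles where the policy output balances the cost gradient; with a tight limit, it stays pinned at the bound and never leaves the feasible range.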
Deep Policy Gradient for Reactive Power Control in Distribution Systems
Pronounced variability due to the growth of renewable energy sources, flexible loads, and distributed generation is challenging residential distribution systems. This context motivates fast, efficient, and robust reactive power control. Optimal reactive power control is possible in theory by solving a non-convex optimization problem based on the exact model of distribution flow. However, the lack of high-precision instrumentation and reliable communications, as well as the heavy computational burden of non-convex optimization solvers, render computing and implementing the optimal control challenging in practice. Taking a statistical learning viewpoint, the present work parameterizes the input-output relationship between each grid state and the corresponding optimal reactive power control (i.e., the policy) by a deep neural network, whose unknown weights are updated by minimizing the accumulated power loss over a number of historical and simulated training pairs using the policy gradient method. In the inference phase, one simply feeds the real-time state vector into the learned neural network to obtain the ‘optimal’ reactive power control decision with only several matrix-vector multiplications. The merits of this novel deep policy gradient approach include its computational efficiency as well as its robustness to random input perturbations. Numerical tests on a 47-bus distribution network using real solar and consumption data corroborate these practical merits.
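The training idea described above can be sketched with a score-function (REINFORCE-style) policy gradient on a toy problem. This is a minimal sketch under stated assumptions, not the paper's method: the policy is a single linear weight `w` rather than a deep network, the state is a scalar, and the quadratic `loss` is a made-up surrogate for power loss whose minimizer is `true_gain * s`. All names and parameter values are hypothetical.

```python
import random


def train_policy(true_gain=0.5, steps=5000, lr=0.02, sigma=0.1, seed=0):
    """Learn a linear Gaussian policy q ~ N(w * s, sigma^2) by policy gradient."""
    rng = random.Random(seed)
    w = 0.0          # policy weight (stand-in for neural network weights)
    baseline = 0.0   # running average of the loss, used to reduce variance
    for _ in range(steps):
        s = rng.uniform(-1.0, 1.0)           # sampled grid state (toy scalar)
        q = rng.gauss(w * s, sigma)          # sampled reactive power action
        loss = (q - true_gain * s) ** 2      # assumed surrogate for power loss
        # Gradient of log N(q; w * s, sigma^2) with respect to w.
        score = (q - w * s) * s / sigma ** 2
        w -= lr * (loss - baseline) * score  # stochastic policy gradient step
        baseline += 0.05 * (loss - baseline)
    return w


def act(w, s):
    # Inference: a deterministic evaluation of the learned policy mean.
    return w * s
```

After training, `act(w, s)` plays the role of the inference phase: the control decision is obtained by a cheap forward evaluation instead of solving an optimization problem.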
- Award ID(s): 1901134
- PAR ID: 10273949
- Date Published:
- Journal Name: Proceedings of IEEE Smartgridcom Conference
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Distributed feedback design and complexity constrained control are examples of problems posed within the domain of structured optimal feedback synthesis. The optimal feedback gain is typically a non-convex function of system primitives. However, in recent years, algorithms have been proposed to obtain locally optimal solutions. In applications to large-scale distributed control, the major obstacle is computational complexity. This paper addresses complexity through a combination of linear-algebraic techniques and computational methods adapted from both machine learning and reinforcement learning. It is shown that for general classes of optimal control problems, the objective function and its gradient can be computed from data. Transformations borrowed from the theory of reinforcement learning are adapted to obtain simulation-based algorithms for computing the structured optimal H2 feedback gain. Customized proximal algorithms based on gradient descent and incremental gradient are tested in computational experiments and their relative merits are discussed.
-
Nonlinear optimal control problems are challenging to solve efficiently due to non-convexity. This paper introduces a trajectory optimization approach that achieves real-time performance by combining machine learning to predict optimal trajectories with refinement by quadratic optimization. First, a library of optimal trajectories is calculated offline and used to train a neural network. Online, the neural network predicts a trajectory for a novel initial state and cost function, and this prediction is further optimized by a sparse quadratic programming solver. We apply this approach to a fly-to-target movement problem for an indoor quadrotor. Experiments demonstrate that the technique calculates near-optimal trajectories in a few milliseconds, and generates agile movement that can be tracked more accurately than existing methods.
-
Dispatching a large fleet of distributed energy resources (DERs) in response to wholesale energy market or regional grid signals requires solving a challenging disaggregation problem when the DERs are located within a distribution network. This manuscript presents a computationally tractable convex inner approximation for the optimal power flow (OPF) problem that characterizes a feeder's aggregate DER hosting capacity and enables a real-time, grid-aware dispatch of DERs for radial distribution networks. The inner approximation is derived by considering convex envelopes on the nonlinear terms in the AC power flow equations. The resulting convex formulation is then used to derive provable nodal injection limits, such that any combination of DER dispatches within their respective nodal limits is guaranteed to be AC admissible. These nodal injection limits are then used to construct a real-time, open-loop control policy for dispatching DERs at each location in the network to collectively deliver grid services. The IEEE-37 distribution network is used to validate the technical results and highlight various use-cases.
-
Feature representations from pre-trained deep neural networks have been known to exhibit excellent generalization and utility across a variety of related tasks. Fine-tuning is by far the simplest and most widely used approach that seeks to exploit and adapt these feature representations to novel tasks with limited data. Despite the effectiveness of fine-tuning, it is often sub-optimal and requires very careful optimization to prevent severe over-fitting to small datasets. The problems of sub-optimality and over-fitting are due in part to the large number of parameters used in a typical deep convolutional neural network. To address these problems, we propose a simple yet effective regularization method for fine-tuning pre-trained deep networks for the task of k-shot learning. To prevent overfitting, our key strategy is to cluster the model parameters while ensuring intra-cluster similarity and inter-cluster diversity of the parameters, effectively regularizing the dimensionality of the parameter search space. In particular, we identify groups of neurons within each layer of a deep network that share similar activation patterns. When the network is to be fine-tuned for a classification task using only k examples, we propagate a single gradient to all of the neuron parameters that belong to the same group. The grouping of neurons is non-trivial as neuron activations depend on the distribution of the input data. To efficiently search for optimal groupings conditioned on the input data, we propose a reinforcement learning search strategy using recurrent networks to learn the optimal group assignments for each network layer. Experimental results show that our method can be easily applied to several popular convolutional neural networks and improve upon other state-of-the-art fine-tuning based k-shot learning strategies by more than 10%.
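The gradient-sharing step in this abstract (one gradient propagated to every neuron in a group) can be sketched as follows. This is an assumption-laden illustration: the abstract does not say how the single gradient is formed, so the sketch uses the group mean, and it treats each neuron's gradient as one scalar. The function name and inputs are hypothetical.

```python
from collections import defaultdict


def share_group_gradients(grads, groups):
    """Replace each neuron's gradient with the mean gradient of its group.

    grads:  per-neuron gradients (scalars here, for simplicity)
    groups: group id for each neuron, same length as grads
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for g, k in zip(grads, groups):
        sums[k] += g
        counts[k] += 1
    # Every neuron in a group receives the same (averaged) gradient, so the
    # group behaves like a single trainable unit during fine-tuning.
    return [sums[k] / counts[k] for k in groups]
```

For example, `share_group_gradients([1.0, 3.0, 5.0], [0, 0, 1])` gives the first two neurons the same update while the third, alone in its group, keeps its own gradient.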