Search for: All records

Creators/Authors contains: "Meyn, Sean"


  1. Submitted for publication; arXiv:2405.17834
    Free, publicly-accessible full text available June 3, 2025
  2. Submitted for publication; arXiv:2403.14109
    Free, publicly-accessible full text available March 21, 2025
  3. Alessandro Astolfi (Ed.)
    Q-learning has become an important part of the reinforcement learning toolkit since its introduction in the dissertation of Chris Watkins in the 1980s. In the original tabular formulation, the goal is to compute exactly a solution to the discounted-cost optimality equation, and thereby obtain the optimal policy for a Markov Decision Process. The goal today is more modest: obtain an approximate solution within a prescribed function class. The standard algorithms are based on the same architecture as formulated in the 1980s, with the goal of finding a value function approximation that solves the so-called projected Bellman equation. While reinforcement learning has been an active research area for over four decades, there is little theory providing conditions for convergence of these Q-learning algorithms, or even existence of a solution to this equation. The purpose of this paper is to show that a solution to the projected Bellman equation does exist, provided the function class is linear and the input used for training is a form of epsilon-greedy policy with sufficiently small epsilon. Moreover, under these conditions it is shown that the Q-learning algorithm is stable, in terms of bounded parameter estimates. Convergence remains one of many open topics for research. 
    Free, publicly-accessible full text available January 1, 2025
  4. Editor-in-Chief: George Yin (Ed.)
    This paper presents approaches to mean-field control, motivated by distributed control of multi-agent systems. Control solutions are based on a convex optimization problem, whose domain is a convex set of probability mass functions (pmfs). The main contributions follow: (1) Kullback-Leibler-Quadratic (KLQ) optimal control is a special case, in which the objective function is composed of a control cost in the form of Kullback-Leibler divergence between a candidate pmf and the nominal, plus a quadratic cost on the sequence of marginals. Theory in this paper extends prior work on deterministic control systems, establishing that the optimal solution is an exponential tilting of the nominal pmf. Transform techniques are introduced to reduce complexity of the KLQ solution, motivated by the need to consider time horizons that are much longer than the inter-sampling times required for reliable control. (2) Infinite-horizon KLQ leads to a state feedback control solution with attractive properties. It can be expressed either as state feedback, in which the state is the sequence of marginal pmfs, or as an open-loop solution that is more easily computed. (3) Numerical experiments are surveyed in an application of distributed control of residential loads to provide grid services, similar to utility-scale battery storage. The results show that KLQ optimal control enables the aggregate power consumption of a collection of flexible loads to track a time-varying reference signal, while simultaneously ensuring each individual load satisfies its own quality of service constraints.
    Free, publicly-accessible full text available October 31, 2024
  5. From the summary: The goal of this article is two-fold: survey the emerging theory of QSA (quasi-stochastic approximation) and its implications for design, and explain the intimate connection between QSA and ESC (extremum seeking control). The contributions go in two directions: ESC algorithm design can benefit by applying concepts from QSA theory, and the broader research community with interest in gradient-free optimization can benefit from the control-theoretic approach inherent to ESC.
    Free, publicly-accessible full text available October 1, 2024
  6. Foundational and state-of-the-art anomaly-detection methods through power system state estimation are reviewed. Traditional components for bad data detection, such as chi-square testing, residual-based methods, and hypothesis testing, are discussed to explain the motivations for recent anomaly-detection methods given the increasing complexity of power grids, energy management systems, and cyber-threats. In particular, state estimation anomaly-detection methods based on data-driven quickest-change detection and artificial intelligence are discussed, and directions for research are suggested with particular emphasis on considerations of the future smart grid.
    Free, publicly-accessible full text available September 1, 2024
  7. Andrea Serrani (Ed.)
    Over the past decade, there has been significant progress on the science of load control for the creation of virtual energy storage. This is an alternative to demand response, and it is termed demand dispatch. Distributed control is used to manage millions of flexible loads to modify the power consumption of the aggregation, which can be ramped up and down, just like discharging and charging a battery. A challenge with distributed control is heterogeneity of the population of loads, which complicates control at the aggregate level. It is shown in this article that additional control at each load in the population can result in a far simpler aggregate model. The local control is designed to flatten resonances and produce an approximately all-pass response. Analysis is based on mean-field control for the heterogeneous population; the mean-field model is only justified because of the additional local control introduced in this article. Theory and simulations indicate that the resulting input-output dynamics of the aggregate have a nearly flat input-output response: the behavior of an ideal, multi-GW battery system.
  8. Alessandro Astolfi (Ed.)
    Demand dispatch is the science of extracting virtual energy storage through the automatic control of deferrable loads to provide balancing or regulation services to the grid, while maintaining consumer-end quality of service (QoS). The control of a large collection of heterogeneous loads is in part a resource allocation problem, since different classes of loads are more valuable for different services. The goal of this paper is to unveil the structure of the optimal solution to the resource allocation problem, and investigate short-term market implications. It is found that the marginal cost for each load class evolves in a two-dimensional subspace spanned by a co-state process and its derivative. The resource allocation problem is recast to construct a dynamic competitive equilibrium model, in which the consumer utility is the negative of the cost of deviation from ideal QoS. It is found that a competitive equilibrium exists with the equilibrium price equal to the negative of an optimal co-state process. Moreover, the equilibrium price is different from what would be obtained based on the standard assumption that the consumer's utility is a function of power consumption.
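
As a rough illustration of the setting described in entry 3 (Q-learning with a linear function class, trained under an epsilon-greedy policy), the sketch below runs the standard update on a toy problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the three-state MDP, the cost function, the three-dimensional feature map `features`, the discount factor, and the step-size rule. It is only meant to show the shape of the recursion, not to reproduce the paper's analysis or results.

```python
import math
import random

GAMMA = 0.9                  # discount factor (assumed for illustration)
N_STATES, N_ACTIONS = 3, 2   # toy problem size (assumed)


def features(x, u):
    """Hypothetical 3-dimensional feature vector for state x and action u."""
    return [1.0, x / (N_STATES - 1), float(u)]


def q_value(theta, x, u):
    """Linear approximation: Q(x, u) = theta . features(x, u)."""
    return sum(t * f for t, f in zip(theta, features(x, u)))


def greedy_action(theta, x):
    """Action minimizing the approximate discounted cost at state x."""
    return min(range(N_ACTIONS), key=lambda u: q_value(theta, x, u))


def epsilon_greedy(theta, x, epsilon, rng):
    """Explore uniformly with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    return greedy_action(theta, x)


def step(x, u, rng):
    """Toy dynamics: action 1 costs more but tends to move toward state 0."""
    cost = x + 0.5 * u
    if u == 1:
        x_next = max(x - 1, 0) if rng.random() < 0.8 else x
    else:
        x_next = min(x + 1, N_STATES - 1) if rng.random() < 0.5 else x
    return cost, x_next


def q_learning(n_steps=5000, epsilon=0.05, seed=0):
    """Run Q-learning with linear function approximation; return theta."""
    rng = random.Random(seed)
    theta = [0.0, 0.0, 0.0]
    x = 0
    for n in range(1, n_steps + 1):
        u = epsilon_greedy(theta, x, epsilon, rng)
        cost, x_next = step(x, u, rng)
        # Temporal-difference target uses the minimum over next actions
        target = cost + GAMMA * min(
            q_value(theta, x_next, a) for a in range(N_ACTIONS)
        )
        td_error = target - q_value(theta, x, u)
        alpha = 1.0 / (n + 10)   # vanishing step size (assumed choice)
        phi = features(x, u)
        theta = [t + alpha * td_error * f for t, f in zip(theta, phi)]
        x = x_next
    return theta


if __name__ == "__main__":
    print(q_learning())
```

Running the script prints the final parameter vector for one seed. Varying `epsilon` and the feature map is a simple way to observe how the parameter estimates behave in this toy problem, though nothing here substitutes for the conditions established in the paper.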