-
Feedback optimization aims to regulate the output of a dynamical system to a value that minimizes a cost function. This problem is beyond the reach of traditional output regulation theory, because the desired value is generally unknown and the reference signal evolves according to a gradient flow driven by the system's real-time output. This paper complements output regulation theory with nonlinear small-gain theory to address this challenge. Specifically, the authors assume that the cost function is strongly convex and that the nonlinear dynamical system is in lower triangular form and subject to parametric uncertainties and a class of external disturbances. An internal model compensates for the effects of the disturbances, while the cyclic small-gain theorem addresses the coupling between the reference signal, the compensators, and the physical system. The proposed solution guarantees the boundedness of the closed-loop signals and regulates the output of the system toward the desired minimizer in a global sense. Two numerical examples illustrate the effectiveness of the proposed method.
Free, publicly-accessible full text available April 1, 2026.
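As a toy illustration of this feedback-optimization loop, the sketch below couples a gradient-flow reference update with a low-level tracking controller. The plant, cost, and gains are hypothetical choices for illustration, not the paper's lower-triangular design or internal-model compensator:

```python
import numpy as np

# Strongly convex cost f(y) = (y - 3)^2 with (unknown to the controller)
# minimizer y* = 3; only its gradient at the measured output is used.
def cost_grad(y):
    return 2.0 * (y - 3.0)

dt, T = 1e-3, 10.0
x, r = 0.0, 0.0                      # plant state and reference signal
for _ in range(int(T / dt)):
    y = x                            # output map y = x
    u = -2.0 * (y - r)               # low-level tracking controller
    x += dt * (-x + u)               # stable first-order plant: x' = -x + u
    r += dt * (-0.5 * cost_grad(y))  # gradient flow driven by real-time output
print(f"y = {x:.3f} (minimizer is 3.0)")
```

Even though the inner proportional controller alone leaves a steady-state tracking error, the gradient flow acts on the measured output, so the output still settles at the minimizer.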
-
This paper proposes a novel learning-based adaptive optimal controller design method for a class of continuous-time linear time-delay systems. A key strategy is to exploit state-of-the-art reinforcement learning (RL) and adaptive dynamic programming (ADP) techniques to obtain a data-driven method that learns a near-optimal controller without precise knowledge of the system dynamics. Specifically, a value iteration (VI) algorithm is proposed to solve the infinite-dimensional Riccati equation for the linear quadratic optimal control problem of time-delay systems using finite samples of input-state trajectory data. It is rigorously proved that the proposed VI algorithm converges to the near-optimal solution. Compared with the previous literature, the proposed VI algorithm has two appealing features: it is developed directly for continuous-time systems without discretization, and it does not require an initial admissible controller for implementation. The efficacy of the proposed methodology is demonstrated by two practical examples: metal cutting and autonomous driving.
Free, publicly-accessible full text available January 1, 2026.
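To convey the VI idea in the simplest possible setting, the sketch below forward-integrates the differential Riccati equation from P = 0 for a finite-dimensional, known-model LQR problem; the paper's algorithm instead handles an infinite-dimensional Riccati equation from trajectory data. Matrices and step size are illustrative:

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])

# Value iteration: integrate the differential Riccati equation forward
# from P = 0; no initial stabilizing/admissible gain is needed.
P, eps = np.zeros((2, 2)), 1e-3
for _ in range(200_000):
    P_next = P + eps * (A.T @ P + P @ A + Q - P @ B @ np.linalg.inv(R) @ B.T @ P)
    if np.max(np.abs(P_next - P)) < 1e-10:
        P = P_next
        break
    P = P_next

K = np.linalg.inv(R) @ B.T @ P       # near-optimal feedback gain u = -K x
print("P =\n", P, "\nK =", K)
```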
-
In this paper, we propose a resilient reinforcement learning method for discrete-time linear systems with unknown parameters under denial-of-service (DoS) attacks. The proposed method is based on policy iteration and learns the optimal controller from input-state data collected amidst DoS attacks. We derive an upper bound on the DoS duration that ensures closed-loop stability. The resilience of the closed-loop system under DoS attacks, with the learned controller and an internal model in place, is thoroughly examined. The effectiveness of the proposed methodology is demonstrated on an inverted pendulum on a cart.
Free, publicly-accessible full text available December 16, 2025.
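A minimal simulation sketch of the resilience question: while the attack is active the control input is lost, and stability hinges on the attack's duty cycle staying below a bound. The plant, the stabilizing gain (standing in for the learned controller), and the periodic attack pattern are all illustrative assumptions:

```python
import numpy as np

A = np.array([[1.1, 0.2], [0.0, 0.9]])   # open-loop unstable plant
B = np.array([[0.0], [1.0]])
K = np.array([[1.8, 1.0]])               # places closed-loop eigenvalues at 0.5

x = np.array([[1.0], [1.0]])
for k in range(200):
    dos_active = (k % 10) < 3            # toy attack: 30% duty cycle
    u = np.zeros((1, 1)) if dos_active else -K @ x
    x = A @ x + B @ u                    # input is dropped while DoS is active
print("||x(200)|| =", float(np.linalg.norm(x)))   # small => still stable
```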
-
This paper studies the distributed feedback optimization problem for linear multi-agent systems without precise knowledge of local costs and agent dynamics. The proposed solution is based on a hierarchical approach in which upper-level coordinators adjust reference signals toward the global optimum and lower-level controllers regulate agents' outputs toward the reference signals. In the absence of precise information on local gradients and agent dynamics, an extremum-seeking mechanism enforces a gradient-descent optimization strategy, and an adaptive dynamic programming approach synthesizes an internal-model-based optimal tracking controller. The whole procedure relies only on measurements of local costs and on input-state data along agents' trajectories. Moreover, under appropriate conditions, the closed-loop signals are bounded and the outputs of the agents converge exponentially to a small neighborhood of the desired extremum. A numerical example validates the efficacy of the proposed method.
Free, publicly-accessible full text available December 16, 2025.
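The extremum-seeking mechanism can be sketched in its classical static-map form: a sinusoidal dither probes the measured local cost, and demodulation recovers a gradient estimate that drives a descent update. The cost, dither amplitude, frequency, and gain below are illustrative, and the final estimate retains a small dither-induced ripple:

```python
import numpy as np

def measured_cost(theta):
    # local cost, known only through measurements; minimizer is 2.0
    return (theta - 2.0) ** 2 + 1.0

dt, a, omega, k = 1e-3, 0.2, 100.0, 0.5  # step, dither amplitude/frequency, gain
theta_hat, t = 0.0, 0.0
for _ in range(int(40.0 / dt)):
    dither = a * np.sin(omega * t)
    J = measured_cost(theta_hat + dither)          # cost values only, no gradient
    grad_est = (2.0 / a) * J * np.sin(omega * t)   # demodulation ~ local gradient
    theta_hat += dt * (-k * grad_est)              # enforced gradient descent
    t += dt
print(f"theta_hat = {theta_hat:.2f} (true minimizer 2.0)")
```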
-
This paper studies the effect of perturbations on the gradient flow of a general nonlinear programming problem, where the perturbation may arise from inaccurate gradient estimation in the setting of data-driven optimization. Under suitable conditions on the objective function, the perturbed gradient flow is shown to be small-disturbance input-to-state stable (ISS), which implies that, in the presence of a small-enough perturbation, the trajectories of the perturbed gradient flow must eventually enter a small neighborhood of the optimum. This work was motivated by the question of robustness of direct methods for the linear quadratic regulator problem, and specifically by the analysis of the effect of perturbations caused by gradient-estimation or round-off errors in policy optimization. We show small-disturbance ISS for three of the most common optimization algorithms: standard gradient flow, natural gradient flow, and Newton gradient flow.
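A quick numerical illustration of the small-disturbance ISS property on a scalar strongly convex problem: as the bound on the perturbation shrinks, the trajectory of the perturbed gradient flow settles into a proportionally smaller neighborhood of the optimum. The cost and the perturbation signal are illustrative:

```python
import numpy as np

grad = lambda x: 2.0 * (x - 1.0)       # f(x) = (x - 1)^2, optimum x* = 1
dt = 1e-3
for dbar in (0.0, 0.1, 0.5):           # perturbation bounds
    x, t = 5.0, 0.0
    for _ in range(int(20.0 / dt)):
        d = dbar * np.sin(3.0 * t)     # bounded gradient-estimation error
        x += dt * (-(grad(x) + d))     # perturbed gradient flow
        t += dt
    print(f"bound {dbar}: final |x - x*| = {abs(x - 1.0):.4f}")
```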
-
This paper proposes a novel robust reinforcement learning framework for discrete-time linear systems with model mismatch that may arise from the sim-to-real gap. A key strategy is to invoke advanced techniques from control theory. Using the formulation of classical risk-sensitive linear quadratic Gaussian (LQG) control, a dual-loop policy optimization algorithm is proposed to generate a robust optimal controller. The dual-loop policy optimization algorithm is shown to be globally and uniformly convergent, and robust against disturbances during the learning process. This robustness property, called small-disturbance input-to-state stability, guarantees that the proposed policy optimization algorithm converges to a small neighborhood of the optimal controller as long as the disturbance at each learning step is relatively small. In addition, when the system dynamics are unknown, a novel model-free off-policy policy optimization algorithm is proposed. Finally, numerical examples are provided to illustrate the proposed algorithm.
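The small-disturbance ISS property of policy optimization can be illustrated on a toy (risk-neutral) discrete-time LQR problem: exact policy-gradient steps are corrupted by bounded noise, yet the gain converges to a neighborhood of the optimum. Model-based gradients and SciPy solvers are used purely for illustration; the paper's algorithm is risk-sensitive, dual-loop, and can be model-free:

```python
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

def pg_step(K, noise):
    Ak = A - B @ K
    P = solve_discrete_lyapunov(Ak.T, Q + K.T @ R @ K)  # cost matrix of policy K
    S = solve_discrete_lyapunov(Ak, np.eye(2))          # state-correlation matrix
    grad = 2 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ S
    return K - 1e-3 * (grad + noise)                    # disturbed gradient step

K, rng = np.zeros((1, 2)), np.random.default_rng(0)     # K = 0 is stabilizing here
for _ in range(5000):
    K = pg_step(K, 0.1 * rng.standard_normal((1, 2)))

P_star = solve_discrete_are(A, B, Q, R)
K_star = np.linalg.inv(R + B.T @ P_star @ B) @ B.T @ P_star @ A
print("||K - K*|| =", np.linalg.norm(K - K_star))       # small, not exactly zero
```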
-
In this paper, we solve the optimal output regulation problem for discrete-time systems without precise knowledge of the system model. Drawing inspiration from reinforcement learning and adaptive dynamic programming, we develop a data-driven solution that enables asymptotic tracking and disturbance rejection. Notably, the proposed approach for discrete-time output regulation differs from its continuous-time counterpart in the persistent-excitation condition required for policy iteration to be unique and convergent. To address this issue, a new persistent-excitation condition is introduced to ensure both the uniqueness and the convergence of the data-driven policy iteration. The efficacy of the proposed methodology is validated by an example of an inverted pendulum on a cart.
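The flavor of a persistent-excitation requirement can be sketched as a rank check on the data matrix used by least-squares policy evaluation: the regressors built from input-state samples must span all distinct entries of a symmetric matrix (n(n+1)/2 of them) for the solution to be unique. This generic check is only illustrative and is not the specific new condition introduced in the paper:

```python
import numpy as np

A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
rng = np.random.default_rng(1)

x, rows = np.zeros((2, 1)), []
for _ in range(50):
    u = rng.standard_normal((1, 1))    # exploratory probing input
    z = np.vstack([x, u])              # regressor [x; u], dimension n = 3
    rows.append((z @ z.T).flatten())   # data row for least-squares evaluation
    x = A @ x + B @ u
Phi = np.array(rows)
# n(n+1)/2 = 6 independent directions are needed for a unique solution
print("rank =", np.linalg.matrix_rank(Phi), "| required:", 3 * (3 + 1) // 2)
```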
-
The distributed optimization algorithm proposed by J. Wang and N. Elia in 2010 has been shown to achieve linear convergence for multi-agent systems with single-integrator dynamics. This paper extends their result, including the linear convergence rate, to a more complex scenario in which the agents have heterogeneous multi-input multi-output linear dynamics and are subject to external disturbances and parametric uncertainties. Disturbances are handled via an internal-model-based control design, and the interaction among the tracking-error dynamics, average dynamics, and dispersion dynamics is analyzed through a composite Lyapunov function and the cyclic small-gain theorem. The key is to ensure a small enough stepsize for the convergence of the proposed algorithm, a condition similar to that for time-scale separation in singular perturbation theory.
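For reference, the 2010 Wang-Elia baseline for single integrators can be sketched as a PI-consensus gradient flow discretized with a small stepsize, echoing the stepsize condition mentioned above. The local costs, graph, and stepsize are illustrative:

```python
import numpy as np

# local costs f_i(x) = (x - m_i)^2; their sum is minimized at the mean, 3.0
grads = [lambda x: 2 * (x - 1.0), lambda x: 2 * (x - 3.0), lambda x: 2 * (x - 5.0)]
L = np.array([[ 2., -1., -1.],         # Laplacian of a connected 3-agent graph
              [-1.,  2., -1.],
              [-1., -1.,  2.]])
x, z = np.zeros(3), np.zeros(3)        # decision estimates, integral states
h = 0.01                               # small stepsize, cf. time-scale separation
for _ in range(20_000):
    g = np.array([grads[i](x[i]) for i in range(3)])
    x, z = x + h * (-g - L @ x - L @ z), z + h * (L @ x)
print("x =", x, "(optimum of the sum is 3.0)")
```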
-
Risk sensitivity is a fundamental aspect of biological motor control that accounts for both the expectation and the variability of movement cost in the face of uncertainty. However, most computational models of biological motor control rely on model-based risk-sensitive optimal control, which requires an accurate internal representation in the central nervous system to predict the outcomes of motor commands. In reality, the dynamics of human-environment interaction are too complex to be accurately modeled, and noise further complicates system identification. To address this issue, this paper proposes a novel risk-sensitive computational mechanism for biological motor control based on reinforcement learning (RL) and adaptive dynamic programming (ADP). The proposed ADP-based mechanism suggests that humans can directly learn an approximation of the risk-sensitive optimal feedback controller from noisy sensory data, without the need for system identification. The proposed mechanism is numerically validated on an arm-reaching task under a divergent force field. The preliminary computational results align with experimental observations reported in the computational neuroscience literature.
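The risk-sensitive criterion at the heart of such models can be illustrated with a toy Monte Carlo comparison: under the exponential utility (1/theta) * log E[exp(theta * J)], two hypothetical movement strategies with equal mean cost but different variability are ranked differently. The distributions and theta are illustrative:

```python
import numpy as np

rng, theta = np.random.default_rng(0), 0.5      # theta > 0: risk-averse
# two hypothetical strategies: equal mean cost, different variability
J_safe = 5.0 + 0.5 * rng.standard_normal(100_000)
J_risky = 5.0 + 3.0 * rng.standard_normal(100_000)

risk_sensitive = lambda J: np.log(np.mean(np.exp(theta * J))) / theta
print("risk-neutral  :", J_safe.mean(), J_risky.mean())       # ~5.0 vs ~5.0
print("risk-sensitive:", risk_sensitive(J_safe), risk_sensitive(J_risky))
```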
-
This paper presents a novel learning-based adaptive optimal controller design for linear time-delay systems described by delay differential equations (DDEs). A key strategy is to exploit the value iteration (VI) approach to solve the linear quadratic optimal control problem for time-delay systems. Previous learning-based control methods, however, have been devoted exclusively to discrete-time time-delay systems. In this article, we fill this gap by developing a learning-based VI approach to solve the infinite-dimensional algebraic Riccati equation (ARE) for continuous-time time-delay systems. One appealing feature of the proposed VI approach is that an initial admissible controller is not required to start the algorithm. The efficacy of the proposed methodology is demonstrated by an autonomous driving example.
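The plant class targeted here can be sketched with a scalar delay differential equation simulated via a history buffer; the hand-picked feedback gain below merely stands in for the learned near-optimal controller:

```python
# x'(t) = a*x(t) + a_d*x(t - tau) + u(t), simulated with a history buffer
a, a_d, tau, dt = 0.2, 0.5, 1.0, 1e-3
N = int(tau / dt)                      # delay expressed in steps
hist = [1.0] * (N + 1)                 # constant initial history x(s) = 1
for _ in range(int(20.0 / dt)):
    x, x_del = hist[-1], hist[-N - 1]  # current and tau-delayed states
    u = -2.0 * x                       # stand-in for the learned controller
    hist.append(x + dt * (a * x + a_d * x_del + u))
    hist = hist[-(N + 1):]             # keep only the last tau seconds
print("x(20) =", hist[-1])             # decays since |a_d| < |a - 2|
```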
