This paper addresses the problem of learning the optimal control policy for a nonlinear stochastic dynamical system. This problem is subject to the ‘curse of dimensionality’ associated with the dynamic programming method. This paper proposes a novel decoupled data-based control (D2C) algorithm that addresses this problem using a decoupled, ‘open-loop - closed-loop’, approach. First, an open-loop deterministic trajectory optimization problem is solved using a black-box simulation model of the dynamical system. Then, closed-loop control is developed around this open-loop trajectory by linearization of the dynamics about this nominal trajectory. By virtue of linearization, a linear quadratic regulator based algorithm can be used for this closed-loop control. We show that the performance of the D2C algorithm is approximately optimal. Moreover, simulation performance suggests a significant reduction in training time compared to other state-of-the-art algorithms.
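The two-phase structure described above can be illustrated with a short sketch: a crude finite-difference gradient search over the open-loop control sequence using only rollouts of a black-box model, followed by a time-varying LQR backward pass along the resulting nominal trajectory. The pendulum model, step sizes, and cost weights below are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch of the decoupled open-loop / closed-loop idea on a toy pendulum;
# all model details and tuning constants are illustrative assumptions.
import numpy as np

dt, T, n, m = 0.05, 60, 2, 1

def f(x, u):
    # "black-box" pendulum step: x = [angle, angular velocity]
    th, om = x
    return np.array([th + dt * om, om + dt * (-9.81 * np.sin(th) + u[0])])

def rollout_cost(U, x0, xg):
    x, c = x0.copy(), 0.0
    for u in U:
        c += 0.1 * u @ u
        x = f(x, u)
    return c + 100.0 * (x - xg) @ (x - xg)

# Phase 1: open-loop trajectory optimization with finite-difference gradients.
x0, xg = np.array([np.pi, 0.0]), np.zeros(2)
U = np.zeros((T, m))
for _ in range(300):
    g = np.zeros_like(U)
    base = rollout_cost(U, x0, xg)
    for t in range(T):
        for j in range(m):
            Up = U.copy(); Up[t, j] += 1e-4
            g[t, j] = (rollout_cost(Up, x0, xg) - base) / 1e-4
    U -= 1e-3 * g

# Nominal trajectory from the optimized open-loop controls.
X = [x0]
for t in range(T):
    X.append(f(X[-1], U[t]))

# Phase 2: linearize about the nominal trajectory and run a finite-horizon
# LQR backward (Riccati) pass to get time-varying feedback gains.
Q, R, eps = np.eye(n), 0.1 * np.eye(m), 1e-5
P, K = 100.0 * np.eye(n), []
for t in reversed(range(T)):
    A = np.column_stack([(f(X[t] + eps * e, U[t]) - f(X[t], U[t])) / eps for e in np.eye(n)])
    B = np.column_stack([(f(X[t], U[t] + eps * e) - f(X[t], U[t])) / eps for e in np.eye(m)])
    Kt = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ Kt)
    K.insert(0, Kt)

# Closed-loop policy: nominal control plus an LQR correction about the nominal state.
def policy(t, x):
    return U[t] - K[t] @ (x - X[t])
```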
Revisiting kinetic Monte Carlo algorithms for time-dependent processes: From open-loop control to feedback control
Simulating stochastic systems with feedback control is challenging due to the complex interplay between the system’s dynamics and the feedback-dependent control protocols. We present a single-step-trajectory probability analysis for time-dependent stochastic systems. Based on this analysis, we revisit several time-dependent kinetic Monte Carlo (KMC) algorithms designed for systems under open-loop control protocols. Our analysis provides a unified alternative proof of these algorithms, summarized in a pedagogical tutorial. Moreover, with the trajectory probability analysis, we present a novel feedback-controlled KMC algorithm that accurately captures the dynamics of systems controlled by an external signal based on measurements of the system’s state. Our method correctly captures the system dynamics and avoids the artificial Zeno effect that arises from incorrectly applying the direct Gillespie algorithm to feedback-controlled systems. This work provides a unified perspective on existing open-loop-control KMC algorithms and offers a powerful and accurate tool for simulating stochastic systems with feedback control.
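For reference, the sketch below shows the standard direct (Gillespie) KMC step for constant rates on a two-state toy model; the time-dependent and feedback-controlled variants discussed above change how the waiting time and next event are sampled. The two-state model and rate values are illustrative assumptions, not the paper's algorithm.

```python
# Direct (Gillespie) KMC step for constant propensities, shown only to fix notation.
import numpy as np

rng = np.random.default_rng(0)

def gillespie_step(rates):
    """Draw (waiting time, event index) for constant propensities `rates`."""
    total = rates.sum()
    tau = rng.exponential(1.0 / total)               # exponential waiting time
    event = rng.choice(len(rates), p=rates / total)  # which transition fires
    return tau, event

# Two-state toy system: state 0 <-> state 1, one allowed transition per state.
k = {0: np.array([2.0]), 1: np.array([0.5])}         # assumed rates
t, state, trajectory = 0.0, 0, [(0.0, 0)]
while t < 10.0:
    tau, _ = gillespie_step(k[state])
    t += tau
    state = 1 - state
    trajectory.append((t, state))
```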
- Award ID(s):
- 2145256
- PAR ID:
- 10575281
- Publisher / Repository:
- AIP
- Date Published:
- Journal Name:
- The Journal of Chemical Physics
- Volume:
- 161
- Issue:
- 4
- ISSN:
- 0021-9606
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
This paper presents convergence analysis of a novel data-driven feedback control algorithm designed for generating online controls based on partial noisy observational data. The algorithm comprises a particle filter-enabled state estimation component, estimating the controlled system’s state via indirect observations, alongside an efficient stochastic maximum principle-type optimal control solver. By integrating weak convergence techniques for the particle filter with convergence analysis for the stochastic maximum principle control solver, we derive a weak convergence result for the optimization procedure in search of optimal data-driven feedback control. Numerical experiments are performed to validate the theoretical findings.
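A minimal sketch of the estimation-then-control loop described above: a bootstrap particle filter estimates the hidden state from noisy observations, and a feedback law acts on the estimate. The scalar dynamics, observation model, and simple linear feedback are illustrative assumptions standing in for the paper's stochastic maximum principle solver.

```python
# Bootstrap particle filter feeding a feedback law; toy scalar model, assumed parameters.
import numpy as np

rng = np.random.default_rng(1)
N, dt = 500, 0.1

def dynamics(x, u):
    # Euler-Maruyama step of a toy controlled diffusion (assumed model).
    return x + dt * (-x + u) + np.sqrt(dt) * 0.2 * rng.standard_normal(np.shape(x))

def obs_likelihood(y, x, sigma=0.3):
    return np.exp(-0.5 * ((y - x) / sigma) ** 2)

particles = rng.standard_normal(N)                       # initial ensemble
x_true, u = 1.5, 0.0
for _ in range(50):
    x_true = dynamics(np.array([x_true]), u)[0]          # hidden truth
    y = x_true + 0.3 * rng.standard_normal()             # partial noisy observation
    particles = dynamics(particles, u)                   # propagate particles
    w = obs_likelihood(y, particles)
    w /= w.sum()
    particles = particles[rng.choice(N, size=N, p=w)]    # resample by weight
    x_hat = particles.mean()                             # state estimate
    u = -2.0 * x_hat                                     # illustrative feedback on the estimate
```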
-
We propose a learning-based robust predictive control algorithm that compensates for significant uncertainty in the dynamics for a class of discrete-time systems that are nominally linear with an additive nonlinear component. Such systems commonly model the nonlinear effects of an unknown environment on a nominal system. We optimize over a class of nonlinear feedback policies inspired by certainty equivalent "estimate-and-cancel" control laws pioneered in classical adaptive control to achieve significant performance improvements in the presence of uncertainties of large magnitude, a setting in which existing learning-based predictive control algorithms often struggle to guarantee safety. In contrast to previous work in robust adaptive MPC, our approach allows us to take advantage of structure (i.e., the numerical predictions) in the a priori unknown dynamics learned online through function approximation. Our approach also extends typical nonlinear adaptive control methods to systems with state and input constraints even when we cannot directly cancel the additive uncertain function from the dynamics. Moreover, we apply contemporary statistical estimation techniques to certify the system’s safety through persistent constraint satisfaction with high probability. Finally, we show in simulation that our method can accommodate more significant unknown dynamics terms than existing methods.
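A minimal sketch of the certainty-equivalent "estimate-and-cancel" idea for a nominally linear system with an additive nonlinear term: the unknown term is fit online with least squares on fixed features, and its estimate is subtracted through the input channel. The system matrices, feature map, and gains below are assumptions; the paper's controller additionally enforces state and input constraints and certifies safety, which this sketch does not attempt.

```python
# Certainty-equivalent estimate-and-cancel on a toy nominally linear system.
import numpy as np

A, B = np.array([[1.0, 0.1], [0.0, 1.0]]), np.array([[0.0], [0.1]])
K = np.array([[-10.0, -5.0]])                     # assumed stabilizing nominal gain

def d_true(x):                                    # unknown additive nonlinearity
    return np.array([0.0, 0.05 * np.sin(x[0])])

def phi(x):                                       # feature map used by the learner
    return np.array([np.sin(x[0]), np.cos(x[0]), x[1], 1.0])

theta = np.zeros((4, 2))                          # feature weights, one column per state
Phi, Y = [], []
x = np.array([1.0, 0.0])
for t in range(200):
    d_hat = phi(x) @ theta                        # current estimate of d(x)
    # nominal feedback plus cancellation of the estimated term through the input channel
    u = K @ x - np.linalg.pinv(B) @ d_hat
    x_next = A @ x + B @ u + d_true(x)
    Phi.append(phi(x)); Y.append(x_next - A @ x - B @ u)   # residual = unknown term d(x)
    theta = np.linalg.lstsq(np.array(Phi), np.array(Y), rcond=None)[0]
    x = x_next
```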
-
Task-invariant feedback control laws for powered exoskeletons are preferred to assist human users across varying locomotor activities. This goal can be achieved with energy shaping methods, where certain nonlinear partial differential equations, i.e., matching conditions, must be satisfied to find the achievable dynamics. Based on the energy shaping methods, open-loop systems can be mapped to closed-loop systems with a desired analytical expression of energy. In this paper, the desired energy consists of modified potential energy that is well-defined and unified across different contact conditions, along with the energy of virtual springs and dampers that improve energy recycling during walking. The human-exoskeleton system achieves input-output passivity and Lyapunov stability during the whole walking period with the proposed method. The corresponding controller provides assistive torques that closely match the human torques of a simulated biped model and of able-bodied human subjects’ data.
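As a rough illustration of potential-energy shaping with virtual spring and damper assistance, the single-joint sketch below compensates a fraction of the gravitational potential and adds spring-damper terms. The segment parameters, shaping fraction, and gains are assumptions, and the sketch does not reproduce the matching conditions or multi-contact treatment described above.

```python
# Single-joint potential-energy-shaping assist with virtual spring/damper (assumed values).
import numpy as np

m, l, g = 30.0, 0.5, 9.81        # lumped segment mass, length, gravity (assumed)
alpha = 0.3                      # fraction of gravitational potential compensated
k_s, b_d = 20.0, 2.0             # virtual spring stiffness and damper coefficient

def grav_torque(q):
    # gradient of the gravitational potential V(q) = m*g*l*(1 - cos(q))
    return m * g * l * np.sin(q)

def assist_torque(q, dq, q_ref):
    # shape the potential (reduce apparent gravity) and add spring/damper assistance
    return alpha * grav_torque(q) - k_s * (q - q_ref) - b_d * dq
```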
-
In modern machine learning systems, distributed algorithms are deployed across applications to ensure data privacy and optimal utilization of computational resources. This work offers a fresh perspective to model, analyze, and design distributed optimization algorithms through the lens of stochastic multi-rate feedback control. We show that a substantial class of distributed algorithms—including popular Gradient Tracking for decentralized learning, and FedPD and Scaffold for federated learning—can be modeled as a certain discrete-time stochastic feedback-control system, possibly with multiple sampling rates. This key observation allows us to develop a generic framework to analyze the convergence of the entire algorithm class. It also enables one to easily add desirable features such as differential privacy guarantees, or to deal with practical settings such as partial agent participation, communication compression, and imperfect communication in algorithm design and analysis.
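As a concrete member of the algorithm class mentioned above, the sketch below implements decentralized Gradient Tracking on quadratic local losses with a ring mixing matrix; the losses and mixing weights are illustrative assumptions, and the feedback-control modeling itself is not reproduced here.

```python
# Decentralized Gradient Tracking on assumed quadratic local losses.
import numpy as np

n_agents, d, eta = 5, 3, 0.05
rng = np.random.default_rng(2)
A_loc = [np.diag(rng.uniform(0.5, 2.0, d)) for _ in range(n_agents)]
b_loc = [rng.standard_normal(d) for _ in range(n_agents)]

def grad(i, x):                                   # local gradient of 0.5*x'Ax - b'x
    return A_loc[i] @ x - b_loc[i]

# Doubly stochastic mixing matrix on a ring (assumed topology).
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

X = np.zeros((n_agents, d))                               # agent iterates
Y = np.array([grad(i, X[i]) for i in range(n_agents)])    # gradient trackers
for _ in range(500):
    X_new = W @ X - eta * Y                               # consensus + descent step
    Y = W @ Y + np.array([grad(i, X_new[i]) - grad(i, X[i]) for i in range(n_agents)])
    X = X_new
```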