Title: Conversion of a Class of Stochastic Control Problems to Fundamental-Solution Deterministic Control Problems
A new optimal control based representation for stationary action trajectories is constructed by exploiting connections between semiconvexity, semiconcavity, and stationarity. This new representation is used to verify a known two-point boundary value problem characterization of stationary action.
Dower, Peter M.; McEneaney, William M.
(Applied Mathematics & Optimization)
Motivated by residual-type neural networks (ResNet), this paper studies optimal control problems constrained by a non-smooth integral equation associated with a fractional differential equation. Such non-smooth equations arise, for instance, in the continuous representation of fractional deep neural networks (DNNs). Here the underlying non-differentiable function is the ReLU or max function. The control enters in a nonlinear and multiplicative manner, and we additionally impose control constraints. Because of the presence of the non-differentiable mapping, the application of standard adjoint calculus is excluded. We derive strong stationary conditions by relying on the limited differentiability properties of the non-smooth map. While traditional approaches smooth the non-differentiable function, no such smoothing is retained in our final strong stationarity system. Thus, this work also closes a gap that currently exists in continuous neural networks with ReLU-type activation functions.
Wang, Pengcheng; Han, Feng; Yi, Jingang
(Journal of Dynamic Systems, Measurement, and Control)
The bikebot (i.e., bicycle-based robot) belongs to a class of underactuated balance robotic systems that require simultaneous trajectory tracking and balance control. We present a tracking and balance control design for an autonomous bikebot. The external-internal convertible structure of the bikebot dynamics is used to design a causal feedback control that achieves both the tracking and balance tasks. A balance equilibrium manifold is used to define and capture the platform balance profiles and their coupled interaction with the trajectory tracking performance. To achieve fully autonomous navigation, a gyrobalancer actuation is integrated with the steering and velocity control for stationary platform balance and stationary-moving switching. Stability and convergence analyses are presented to guarantee the control performance. Extensive experiments are presented to validate and demonstrate the autonomous control design. We also compare the autonomous control performance with human riding experiments and find similar action strategies in both.
Dower, Peter M.; McEneaney, William M.; Zheng, Y
(Proceedings of the American Control Conference)
A finite horizon nonlinear optimal control problem is considered for which the associated Hamiltonian satisfies a uniform semiconcavity property with respect to its state and costate variables. It is shown that the value function for this optimal control problem is equivalent to the value of a min-max game, provided the time horizon considered is sufficiently short. This further reduces to maximization of a linear functional over a convex set. It is further proposed that the min-max game can be relaxed to a type of stat (stationary) game, in which no time horizon constraint is involved.
General function approximation is a powerful tool for handling large state and action spaces in a broad range of reinforcement learning (RL) scenarios. However, theoretical understanding of non-stationary MDPs with general function approximation is still limited. In this paper, we make the first such attempt. We first propose a new complexity metric called the dynamic Bellman Eluder (DBE) dimension for non-stationary MDPs, which subsumes the majority of existing tractable RL problems in static MDPs as well as non-stationary MDPs. Based on the proposed complexity metric, we propose a novel confidence-set based model-free algorithm called SW-OPEA, which features a sliding window mechanism and a new confidence set design for non-stationary MDPs. We then establish an upper bound on the dynamic regret of the proposed algorithm, and show that SW-OPEA is provably efficient as long as the variation budget is not significantly large. We further demonstrate via examples of non-stationary linear and tabular MDPs that our algorithm performs better than the existing UCB-type algorithms in the small variation budget scenario. To the best of our knowledge, this is the first dynamic regret analysis in non-stationary MDPs with general function approximation.
McEneaney, William M, and Dower, Peter M. Conversion of a Class of Stochastic Control Problems to Fundamental-Solution Deterministic Control Problems. Retrieved from https://par.nsf.gov/biblio/10170563. Proceedings of the American Control Conference.
@article{osti_10170563,
title = {Conversion of a Class of Stochastic Control Problems to Fundamental-Solution Deterministic Control Problems},
url = {https://par.nsf.gov/biblio/10170563},
abstractNote = {A new optimal control based representation for stationary action trajectories is constructed by exploiting connections between semiconvexity, semiconcavity, and stationarity. This new representation is used to verify a known two-point boundary value problem characterization of stationary action.},
journal = {Proceedings of the American Control Conference},
author = {McEneaney, William M and Dower, Peter M.},
}