 Award ID(s):
 1908918
 NSFPAR ID:
 10170563
 Date Published:
 Journal Name:
 Proceedings of the American Control Conference
 ISSN:
 0743-1619
 Page Range / eLocation ID:
 1779-1784
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

A new optimal control based representation for stationary action trajectories is constructed by exploiting connections between semiconvexity, semiconcavity, and stationarity. This new representation is used to verify a known two-point boundary value problem characterization of stationary action.

Bikebot (i.e., bicycle-based robot) is a class of underactuated balance robotic systems that require simultaneous trajectory tracking and balance control tasks. We present a tracking and balance control design of an autonomous bikebot. The external-internal convertible structure of the bikebot dynamics is used to design a causal feedback control to achieve both the tracking and balance tasks. A balance equilibrium manifold is used to define and capture the platform balance profiles and their coupled interaction with the trajectory tracking performance. To achieve fully autonomous navigation, a gyrobalancer actuation is integrated with the steering and velocity control for stationary platform balance and stationary-moving switching. Stability and convergence analyses are presented to guarantee the control performance. Extensive experiments are presented to validate and demonstrate the autonomous control design. We also compare the autonomous control performance with human riding experiments and find similar action strategies in both.

A finite horizon nonlinear optimal control problem is considered for which the associated Hamiltonian satisfies a uniform semiconcavity property with respect to its state and costate variables. It is shown that the value function for this optimal control problem is equivalent to the value of a min-max game, provided the time horizon considered is sufficiently short. This further reduces to maximization of a linear functional over a convex set. It is further proposed that the min-max game can be relaxed to a type of stat (stationary) game, in which no time horizon constraint is involved.

We consider the linear third order (in time) PDE known as the SMGTJ-equation, defined on a bounded domain, under the action of either Dirichlet or Neumann boundary control $g$. Optimal interior and boundary regularity results were given in [1], after [41], when $g \in L^2(0, T;L^2(\Gamma)) \equiv L^2(\Sigma)$, which, moreover, in the canonical case $\gamma = 0$, were expressed by the well-known explicit representation formulae of the wave equation in terms of cosine/sine operators [19], [17], [24, Vol. II]. The interior or boundary regularity theory is however the same, whether $\gamma = 0$ or $0 \neq \gamma \in L^{\infty}(\Omega)$, since $\gamma \neq 0$ is responsible only for lower order terms. Here we exploit such cosine operator-based explicit representation formulae to provide optimal interior and boundary regularity results with $g$ "smoother" than $L^2(\Sigma)$, qualitatively by one unit, two units, etc. in the Dirichlet boundary case. To this end, we invoke the corresponding results for wave equations, as in [17]. Similarly for the Neumann boundary case, by invoking the corresponding results for the wave equation as in [22], [23], [37] for control smoother than $L^2(0, T;L^2(\Gamma))$, and [44] for control less regular in space than $L^2(\Gamma)$. In addition, we provide optimal interior and boundary regularity results when the SMGTJ-equation is subject to interior point control, by invoking the corresponding wave equations results [42], [24, Section 9.8.2].

Krause, Andreas and (Ed.) General function approximation is a powerful tool to handle large state and action spaces in a broad range of reinforcement learning (RL) scenarios. However, theoretical understanding of nonstationary MDPs with general function approximation is still limited. In this paper, we make the first such attempt. We first propose a new complexity metric called dynamic Bellman Eluder (DBE) dimension for nonstationary MDPs, which subsumes the majority of existing tractable RL problems in static MDPs as well as nonstationary MDPs. Based on the proposed complexity metric, we propose a novel confidence-set based model-free algorithm called SW-OPEA, which features a sliding window mechanism and a new confidence set design for nonstationary MDPs. We then establish an upper bound on the dynamic regret for the proposed algorithm, and show that SW-OPEA is provably efficient as long as the variation budget is not significantly large. We further demonstrate via examples of nonstationary linear and tabular MDPs that our algorithm performs better in the small variation budget scenario than the existing UCB-type algorithms. To the best of our knowledge, this is the first dynamic regret analysis in nonstationary MDPs with general function approximation.