

Title: A finite horizon optimal stochastic impulse control problem with a decision lag
This paper studies an optimal stochastic impulse control problem in a finite time horizon with a decision lag, by which we mean that after an impulse is made, a fixed number of units of time must elapse before the next impulse is allowed. The continuity of the value function is proved. A suitable version of the dynamic programming principle is established, which takes into account the dependence of the state process on the elapsed time. The corresponding Hamilton-Jacobi-Bellman (HJB) equation is derived, which exhibits some special features of the problem. The value function of this optimal impulse control problem is characterized as the unique viscosity solution to the corresponding HJB equation. An optimal impulse control is constructed provided the value function is given. Moreover, the limiting case with the waiting time approaching 0 is discussed.
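For orientation, a generic formulation of such a problem (a sketch only, with placeholder coefficients b, \sigma, costs f, g, \ell, and waiting time \delta > 0; not the paper's exact notation) is

    X(s) = x + \int_t^s b(r, X(r))\,dr + \int_t^s \sigma(r, X(r))\,dW(r) + \sum_{i:\,\tau_i \le s} \xi_i, \qquad \tau_{i+1} - \tau_i \ge \delta,

    V(t, x) = \inf_{\{(\tau_i,\,\xi_i)\}} \mathbb{E}\Big[ \int_t^T f(s, X(s))\,ds + \sum_{i:\,\tau_i \le T} \ell(\xi_i) + g(X(T)) \Big],

where the infimum is taken over impulse times \tau_i and impulse sizes \xi_i respecting the decision lag \delta; because of the lag, the value function also depends on the time elapsed since the most recent impulse.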
Award ID(s):
1812921
NSF-PAR ID:
10220019
Author(s) / Creator(s):
Date Published:
Journal Name:
Dynamics of continuous discrete and impulsive systems
Volume:
28
ISSN:
1918-2538
Page Range / eLocation ID:
89-123
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Buttazzo, G.; Casas, E.; de Teresa, L.; Glowinski, R.; Leugering, G.; Trélat, E.; Zhang, X. (Eds.)
    An optimal control problem is considered for a stochastic differential equation with the cost functional determined by a backward stochastic Volterra integral equation (BSVIE, for short). This kind of cost functional can cover general discounting (including exponential and non-exponential) situations with a recursive feature. It is known that such a problem is time-inconsistent in general. Therefore, instead of finding a global optimal control, we look for a time-consistent, locally near-optimal equilibrium strategy. With the idea of multi-person differential games, a family of approximate equilibrium strategies is constructed, associated with partitions of the time interval. By sending the mesh size of the time interval partition to zero, an equilibrium Hamilton–Jacobi–Bellman (HJB, for short) equation is derived, through which the equilibrium value function and an equilibrium strategy are obtained. Under certain conditions, a verification theorem is proved and the well-posedness of the equilibrium HJB equation is established. As a sort of Feynman–Kac formula for the equilibrium HJB equation, a new class of BSVIEs (containing the diagonal value Z(r,r) of Z(⋅,⋅)) is naturally introduced, and the well-posedness of such equations is briefly presented.
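    For context, a prototypical BSVIE of the kind alluded to (a generic sketch, not necessarily the exact class studied here) takes the form

        Y(t) = \psi(t) + \int_t^T g\big(t, r, Y(r), Z(t,r), Z(r,r)\big)\,dr - \int_t^T Z(t,r)\,dW(r), \qquad t \in [0, T],

    where the appearance of the diagonal value Z(r,r) of Z(⋅,⋅) is the new feature mentioned above.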
  2. We propose a neural network approach that yields approximate solutions for high-dimensional optimal control problems and demonstrate its effectiveness using examples from multi-agent path finding. Our approach yields controls in a feedback form, where the policy function is given by a neural network (NN). Specifically, we fuse the Hamilton-Jacobi-Bellman (HJB) and Pontryagin Maximum Principle (PMP) approaches by parameterizing the value function with an NN. Our approach enables us to obtain approximately optimal controls in real-time without having to solve an optimization problem. Once the policy function is trained, generating a control at a given space-time location takes milliseconds; in contrast, efficient nonlinear programming methods typically perform the same task in seconds. We train the NN offline using the objective function of the control problem and penalty terms that enforce the HJB equations. Therefore, our training algorithm does not involve data generated by another algorithm. By training on a distribution of initial states, we ensure the controls' optimality on a large portion of the state-space. Our grid-free approach scales efficiently to dimensions where grids become impractical or infeasible. We apply our approach to several multi-agent collision-avoidance problems in up to 150 dimensions. Furthermore, we empirically observe that the number of parameters in our approach scales linearly with the dimension of the control problem, thereby mitigating the curse of dimensionality. 
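    As a rough illustration of the ingredients described above (value network, PMP-style feedback control, HJB residual penalty), the following is a minimal PyTorch-style sketch for a toy problem with dynamics dx/dt = u and running cost 0.5|u|^2 + q(x); the architecture, the helper name hjb_residual, and the training loop are illustrative assumptions, not the authors' implementation (which also trains on the control objective itself).

        import torch
        import torch.nn as nn

        # Illustrative value-function network V_theta(t, x); the architecture is an assumption.
        class ValueNet(nn.Module):
            def __init__(self, dim, width=64):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Linear(dim + 1, width), nn.Tanh(),
                    nn.Linear(width, width), nn.Tanh(),
                    nn.Linear(width, 1),
                )

            def forward(self, t, x):
                return self.net(torch.cat([t, x], dim=-1))

        def hjb_residual(model, t, x, q):
            """HJB residual V_t + min_u [grad_x V . u + 0.5|u|^2 + q(x)] for dx/dt = u.
            The minimizing (PMP feedback) control is u* = -grad_x V, so the bracketed
            term equals q(x) - 0.5 |grad_x V|^2."""
            t = t.requires_grad_(True)
            x = x.requires_grad_(True)
            v = model(t, x)
            v_t, v_x = torch.autograd.grad(v.sum(), (t, x), create_graph=True)
            return v_t + q(x) - 0.5 * (v_x ** 2).sum(dim=-1, keepdim=True)

        # Training sketch: sample space-time points and penalize the squared HJB residual
        # (the full approach would also include the control objective and a terminal condition).
        dim = 4
        model = ValueNet(dim)
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        q = lambda x: (x ** 2).sum(dim=-1, keepdim=True)  # toy running state cost
        for step in range(1000):
            t, x = torch.rand(256, 1), torch.randn(256, dim)
            loss = (hjb_residual(model, t, x, q) ** 2).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()

    In this toy setting, the trained feedback control at any (t, x) is read off as u = -grad_x V_theta(t, x), which is why generating a control reduces to a fast network evaluation rather than an online optimization.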
  3. An optimal control problem is considered for a stochastic differential equation containing a state-dependent regime switching, with a recursive cost functional. Due to the non-exponential discounting in the cost functional, the problem is time-inconsistent in general. Therefore, instead of finding a global optimal control (which is not possible), we look for a time-consistent (approximately) locally optimal equilibrium strategy. Such a strategy can be represented through the solution to a system of partial differential equations, called an equilibrium Hamilton–Jacobi–Bellman (HJB) equation, which is constructed via a sequence of multi-person differential games. A verification theorem is proved and, under proper conditions, the well-posedness of the equilibrium HJB equation is established as well.
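    For reference, the state-dependent regime switching mentioned above generically refers to dynamics of the form (a sketch, not the paper's exact setup)

        dX(s) = b(s, X(s), \alpha(s))\,ds + \sigma(s, X(s), \alpha(s))\,dW(s),

    where \alpha(\cdot) is a finite-state process whose transition rates q_{ij}(s, x) may depend on the current state X(s); the equilibrium HJB equation is then a coupled system, one equation per regime i, linked through terms of the form \sum_{j \ne i} q_{ij}(s, x)\,[V(s, x, j) - V(s, x, i)].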
  4. A deterministic optimal impulse control problem with a terminal state constraint is considered. Due to the terminal state constraint, the value function might be discontinuous in general. The main contribution of this paper is the introduction of an intrinsic condition under which the value function is proved to be continuous. Then, by a Bellman dynamic programming principle, the corresponding Hamilton-Jacobi-Bellman type quasi-variational inequality (QVI, for short) is derived. The value function is proved to be a viscosity solution to this QVI. The issue of whether the value function is characterized as the unique viscosity solution to this QVI is carefully addressed, and the question is left as a challenging open problem.
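    For orientation, an HJB-type quasi-variational inequality of this kind, written in its generic unconstrained form (the terminal state constraint requires the extra care described above), reads

        \min\Big\{ -\partial_t V(t,x) - \langle \nabla V(t,x), f(t,x) \rangle - g(t,x),\; V(t,x) - \mathcal{M}V(t,x) \Big\} = 0, \qquad \mathcal{M}V(t,x) = \inf_{\xi}\big[ V(t, x + \xi) + c(\xi) \big],

    where f is the drift of the deterministic dynamics, g the running cost, and c(\xi) the cost of applying an impulse \xi.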
  5. The vortex dynamics and lift force generated by a sinusoidally heaving and pitching airfoil during dynamic stall are experimentally investigated for reduced frequencies of k = fc/U∞ = 0.06-0.16, a pitching amplitude of 75°, and a heaving amplitude of h0/c = 0.6. The lift force is calculated from the velocity fields using the finite-domain impulse theory. The concept of the moment-arm dilemma associated with the impulse equation is revisited to shed light on its physical impact on the calculated forces. It is shown that by selecting an objectively defined origin of the moment arm, the impulse force equation can be greatly simplified to two terms that have a clear physical meaning: (i) the time rate of change of impulse of vortical structures within the control volume and (ii) the Lamb vector, which indirectly captures the contribution of vortical structures outside of the control volume. The results show that the trend of the lift force depends on the formation of the leading edge vortex, as well as its time rate of change of circulation and chord-wise advection relative to the airfoil. Additionally, the trailing edge vortex, which is observed to form only for k ≥ 0.10, is shown to have lift-diminishing effects that intensify with increasing reduced frequency. Lastly, the concept of optimal vortex formation is investigated. The leading edge vortex is shown to attain the optimal formation number of approximately 4 for k ≤ 0.1, when the scaling is based on the leading edge shear velocity. For larger values of k, the vortex growth is delayed to later in the cycle and does not reach its optimal value. As a result, the peak lift force occurs later in the cycle. This has consequences for power production, which relies on the correlation of the relative timing of the lift force and heaving velocity.
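    For reference, the classical impulse-based force formula underlying finite-domain impulse theory, of which the simplified two-term expression above is a finite-domain variant, is (with N the spatial dimension, \rho the density, \boldsymbol{\omega} the vorticity, and \mathbf{u} the velocity; the exact coefficients of the correction terms depend on the formulation and are given in the paper)

        \mathbf{F} \approx -\frac{\rho}{N-1}\,\frac{d}{dt}\int_{V} \mathbf{x} \times \boldsymbol{\omega}\,dV \;+\; (\text{Lamb-vector term}), \qquad \text{Lamb vector: } \boldsymbol{\omega} \times \mathbf{u},

    with the first term corresponding to contribution (i) above and the Lamb-vector term to contribution (ii).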