Title: Data-Driven Optimal Control of Traffic Signals for Urban Road Networks
This paper studies data-driven optimal control design for traffic signals in oversaturated urban road networks. The signal control system based on the store-and-forward model is generally uncontrollable, so a controllable decomposition is needed. Instead of identifying unknown parameters such as saturation rates and turning ratios, a finite number of measured trajectories can be used to parametrize the system and directly construct a transformation matrix for the Kalman controllable decomposition through the fundamental lemma of J. C. Willems. Building on this, an infinite-horizon linear quadratic regulator (LQR) problem is formulated that accounts for the constraints on green times for traffic signals. The problem can be solved through a two-phase data-driven learning process, in which one phase solves an infinite-horizon unconstrained LQR problem and the other solves a finite-horizon constrained LQR problem. The simulation results support the theoretical analysis and show that the proposed data-driven controller can achieve the desired performance in reducing traffic congestion.
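Willems' fundamental lemma requires the recorded input to be persistently exciting, meaning that a Hankel matrix built from the input sequence has full row rank. The following is a minimal sketch of that rank check for a scalar input, not the paper's construction of the transformation matrix. The function names `hankel`, `rank`, and `persistently_exciting` are hypothetical, and the required excitation order is left as a parameter because it depends on the system dimension.

```python
from fractions import Fraction


def hankel(signal, depth):
    """Build the Hankel matrix with `depth` rows from a scalar sequence."""
    cols = len(signal) - depth + 1
    return [[Fraction(signal[i + j]) for j in range(cols)] for i in range(depth)]


def rank(matrix):
    """Compute the rank by exact Gaussian elimination over the rationals."""
    rows = [row[:] for row in matrix]
    r = 0
    for c in range(len(rows[0]) if rows else 0):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def persistently_exciting(signal, order):
    """True if the scalar signal is persistently exciting of the given order."""
    return rank(hankel(signal, order)) == order
```

A constant input fails the check for any order above one, while a sufficiently varied input passes, which is why the data-collection experiment needs a rich enough signal.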
Pang, Bo; Jiang, Zhong-Ping
(Proc. 59th IEEE Conference on Decision and Control)
This paper presents a data-driven algorithm to solve the problem of infinite-horizon linear quadratic regulation (LQR) for a class of discrete-time linear time-invariant systems subject to state and control constraints. The problem is divided into a constrained finite-horizon LQR subproblem and an unconstrained infinite-horizon LQR subproblem, each of which can be solved directly from collected input/state data. Under certain conditions, the combination of the solutions of the subproblems converges to the optimal solution of the original problem. The effectiveness of the proposed approach is validated by a numerical example.
Sun, Yue; Fazel, Maryam
(60th IEEE Conference on Decision and Control (CDC))
Common reinforcement learning methods seek optimal controllers for unknown dynamical systems by searching in the "policy" space directly. A recent line of research, starting with [1], aims to provide theoretical guarantees for such direct policy-update methods by exploring their performance in classical control settings, such as the infinite horizon linear quadratic regulator (LQR) problem. A key property these analyses rely on is that the LQR cost function satisfies the "gradient dominance" property with respect to the policy parameters. Gradient dominance helps guarantee that the optimal controller can be found by running gradient-based algorithms on the LQR cost. The gradient dominance property has so far been verified on a case-by-case basis for several control problems including continuous/discrete time LQR, LQR with decentralized controller, and H2/H∞ robust control. In this paper, we make a connection between this line of work and classical convex parameterizations based on linear matrix inequalities (LMIs). Using this, we propose a unified framework for showing that gradient dominance indeed holds for a broad class of control problems, such as continuous- and discrete-time LQR, minimizing the L2 gain, and problems using system-level parameterization. Our unified framework provides insights into the landscape of the cost function as a function of the policy, and enables extending convergence results for policy gradient descent to a much larger class of problems.
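To make the idea of running gradient descent directly on the LQR cost concrete, here is a small sketch for a scalar discrete-time system x(t+1) = a x(t) + b u(t) with feedback u = -k x and unit initial state. It is an illustration under those assumptions, not the unified framework of the abstract, and all function names are hypothetical. For a stabilizing gain the cost has the closed form (q + r k^2) / (1 - (a - b k)^2), and the descent iterate can be compared with the gain obtained from the scalar Riccati recursion.

```python
def lqr_cost(k, a, b, q, r):
    """Infinite-horizon cost of u = -k x from x(0) = 1 (infinite if unstable)."""
    closed_loop = a - b * k
    if abs(closed_loop) >= 1.0:
        return float("inf")
    return (q + r * k * k) / (1.0 - closed_loop ** 2)


def lqr_cost_gradient(k, a, b, q, r):
    """Derivative of the closed-form cost with respect to the gain k."""
    closed_loop = a - b * k
    num = q + r * k * k
    den = 1.0 - closed_loop ** 2
    d_num = 2.0 * r * k
    d_den = 2.0 * b * closed_loop
    return (d_num * den - num * d_den) / den ** 2


def policy_gradient_descent(k0, a, b, q, r, step=0.1, iterations=200):
    """Plain gradient descent on the gain, started from a stabilizing k0."""
    k = k0
    for _ in range(iterations):
        k -= step * lqr_cost_gradient(k, a, b, q, r)
    return k


def riccati_gain(a, b, q, r, iterations=200):
    """Optimal gain from fixed-point iteration of the scalar Riccati equation."""
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)
```

With a = b = q = r = 1 and a stabilizing start such as k0 = 1, the descent iterate and the Riccati gain can be compared directly; the step size and iteration count here are illustrative choices and may need adjusting for other parameters.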
For energy-efficient Connected and Automated Vehicle (CAV) Eco-driving control on signalized arterials under uncertain traffic conditions, this paper explicitly considers traffic control devices (e.g., road markings, traffic signs, and traffic signals) and road geometry (e.g., road shapes, road boundaries, and road grades) constraints in a data-driven optimization-based Model Predictive Control (MPC) modeling framework. This modeling framework uses real-time vehicle driving and traffic signal data via Vehicle-to-Infrastructure (V2I) and Vehicle-to-Vehicle (V2V) communications. In the MPC-based control model, this paper mathematically formulates location-based traffic control devices and road geometry constraints using the geographic information from High-Definition (HD) maps. The location-based traffic control devices and road geometry constraints have the potential to improve the safety, energy, efficiency, driving comfort, and robustness of connected and automated driving on real roads by considering interrupted flow facility locations and road geometry in the formulation. We predict a set of uncertain driving states for the preceding vehicles through an online learning-based driving dynamics prediction model. We then solve a constrained finite-horizon optimal control problem with the predicted driving states to obtain a set of Eco-driving references for the controlled vehicle. To obtain the optimal acceleration or deceleration commands for the controlled vehicle with the set of Eco-driving references, we formulate a Distributionally Robust Stochastic Optimization (DRSO) model (i.e., a special case of data-driven optimization models under moment bounds) with Distributionally Robust Chance Constraints (DRCC) with location-based traffic control devices and road geometry constraints. 
We design experiments to demonstrate the proposed model under different traffic conditions using real-world connected vehicle trajectory data and Signal Phasing and Timing (SPaT) data on a coordinated arterial with six actuated intersections on Fuller Road in Ann Arbor, Michigan from the Safety Pilot Model Deployment (SPMD) project.
McEneaney, William M; Dower, Peter M.
(Proceedings of the American Control Conference)
By exploiting min-plus linearity, semiconcavity, and semigroup properties of dynamic programming, a fundamental solution semigroup for a class of approximate finite horizon linear infinite dimensional optimal control problems is constructed. Elements of this fundamental solution semigroup are parameterized by the time horizon, and can be used to approximate the solution of the corresponding finite horizon optimal control problem for any terminal cost. They can also be composed to compute approximations on longer horizons. The value function approximation provided takes the form of a min-plus convolution of a kernel with the terminal cost. A general construction for this kernel is provided, along with a spectral representation for a restricted class of sub-problems.
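As a finite, discrete illustration of the min-plus convolution mentioned above, the sketch below convolves two sequences using min in place of summation and addition in place of multiplication. This is only the standard sequence version of the operation, assumed here for illustration; the abstract concerns infinite-dimensional problems, and the name `min_plus_convolution` is hypothetical.

```python
def min_plus_convolution(f, g):
    """Return h with h[n] = min over i + j = n of f[i] + g[j]."""
    result = [float("inf")] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            # Min replaces the sum, and addition replaces the product.
            result[i + j] = min(result[i + j], fi + gj)
    return result


kernel = [0, 1, 4]
terminal_cost = [0, 2, 3]
value = min_plus_convolution(kernel, terminal_cost)
```

The operation is min-plus linear: convolving a kernel with the pointwise minimum of two costs gives the pointwise minimum of the two separate convolutions.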
Luo, Yuwei; Gupta, Varun; Kolar, Mladen
(Proceedings of the ACM on Measurement and Analysis of Computing Systems)
We consider the problem of controlling a Linear Quadratic Regulator (LQR) system over a finite horizon $T$ with fixed and known cost matrices $Q, R$, but unknown and non-stationary dynamics $A_t, B_t$. The sequence of dynamics matrices can be arbitrary, but with a total variation, $V_T$, assumed to be $o(T)$ and unknown to the controller. Under the assumption that a sequence of stabilizing, but potentially sub-optimal controllers is available for all $t$, we present an algorithm that achieves the optimal dynamic regret of $O(V_T^{2/5} T^{3/5})$. With piecewise constant dynamics, our algorithm achieves the optimal regret of $O(\sqrt{ST})$ where $S$ is the number of switches. The crux of our algorithm is an adaptive non-stationarity detection strategy, which builds on an approach recently developed for contextual Multi-armed Bandit problems. We also argue that non-adaptive forgetting (e.g., restarting or using sliding window learning with a static window size) may not be regret optimal for the LQR problem, even when the window size is optimally tuned with the knowledge of $V_T$. The main technical challenge in the analysis of our algorithm is to prove that the ordinary least squares (OLS) estimator has a small bias when the parameter to be estimated is non-stationary. Our analysis also highlights that the key motif driving the regret is that the LQR problem is in spirit a bandit problem with linear feedback and locally quadratic cost. This motif is more universal than the LQR problem itself, and therefore we believe our results should find wider application.
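For readers unfamiliar with the OLS estimator referred to above, the following sketch estimates the parameters of a scalar system x(t+1) = a x(t) + b u(t) from one recorded trajectory by solving the 2x2 normal equations. It assumes stationary, noise-free dynamics, so it shows only the baseline estimator and not the adaptive non-stationarity detection of the abstract; the names `simulate` and `ols_estimate` are hypothetical.

```python
def simulate(a, b, inputs, x0=0.0):
    """Roll out x(t+1) = a x(t) + b u(t) and return all visited states."""
    states = [x0]
    for u in inputs:
        states.append(a * states[-1] + b * u)
    return states


def ols_estimate(states, inputs):
    """Least-squares estimate of (a, b) via the 2x2 normal equations."""
    sxx = sxu = suu = sxy = suy = 0.0
    for x, u, y in zip(states[:-1], inputs, states[1:]):
        sxx += x * x
        sxu += x * u
        suu += u * u
        sxy += x * y
        suy += u * y
    det = sxx * suu - sxu * sxu
    if det == 0.0:
        raise ValueError("data are not informative enough to identify (a, b)")
    a_hat = (sxy * suu - suy * sxu) / det
    b_hat = (suy * sxx - sxy * sxu) / det
    return a_hat, b_hat
```

Restricting the sums to the most recent samples would give a sliding-window variant of the kind the abstract argues may not be regret optimal with a static window size.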
Liu, Tong, Wang, Hong, and Jiang, Zhong-Ping. Data-Driven Optimal Control of Traffic Signals for Urban Road Networks. Retrieved from https://par.nsf.gov/biblio/10479222. Web. doi:10.1109/CDC51059.2022.9992876.
@article{osti_10479222,
place = {Country unknown/Code not available},
title = {Data-Driven Optimal Control of Traffic Signals for Urban Road Networks},
url = {https://par.nsf.gov/biblio/10479222},
DOI = {10.1109/CDC51059.2022.9992876},
abstractNote = {This paper studies the issue of data-driven optimal control design for traffic signals of oversaturated urban road networks. The signal control system based on the store and forward model is generally uncontrollable for which the controllable decomposition is needed. Instead of identifying the unknown parameters like saturation rates and turning ratios, a finite number of measured trajectories can be used to parametrize the system and help directly construct a transformation matrix for Kalman controllable decomposition through the fundamental lemma of J. C. Willems. On top of that, an infinite-horizon linear quadratic regulator (LQR) problem is formulated considering the constraints of green times for traffic signals. The problem can be solved through a two-phase data-driven learning process, where one solves an infinite-horizon unconstrained LQR problem and the other solves a finite-horizon constrained LQR problem. The simulation result shows the theoretical analysis is effective and the proposed data-driven controller can yield desired performance for reducing traffic congestion.},
journal = {},
publisher = {IEEE},
author = {Liu, Tong and Wang, Hong and Jiang, Zhong-Ping},
}