Numerous solutions have been proposed for Traffic Signal Control (TSC) tasks, aiming to provide efficient transportation and alleviate traffic congestion. Recently, Reinforcement Learning (RL) methods have attained promising results through trial and error in simulators, bringing confidence in solving cities' congestion problems. However, performance gaps still exist when simulator-trained policies are deployed to the real world. This issue is mainly caused by the difference in system dynamics between the training simulators and the real-world environments. In this work, we leverage the knowledge of Large Language Models (LLMs) to understand and profile the system dynamics through a prompt-based grounded action transformation that bridges the performance gap. Specifically, this paper exploits the pre-trained LLM's inference ability to understand how traffic dynamics change with weather conditions, traffic states, and road types. With awareness of these changes, the policy's actions are taken and grounded based on realistic dynamics, thus helping the agent learn a more realistic policy. We conduct experiments on four different scenarios to show the effectiveness of the proposed PromptGAT in mitigating the performance gap of reinforcement learning from simulation to reality (sim-to-real).
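As a rough illustration of the general grounded action transformation idea referred to above (not the PromptGAT implementation, whose details the abstract does not give), the sketch below grounds an action by predicting the real-world next state with a forward model and then asking an inverse simulator model which action would reproduce that state in simulation. The one-dimensional dynamics, the `efficiency` factor, and all function names are hypothetical and chosen only for illustration.

```python
# Toy sketch of grounded action transformation (GAT).
# All dynamics, names, and numbers here are illustrative assumptions.

def sim_step(state: float, action: float) -> float:
    """Hypothetical simulator dynamics: the action takes full effect."""
    return state + action


def real_forward_model(state: float, action: float, efficiency: float = 0.8) -> float:
    """Assumed model of real-world dynamics: the action is only partly
    effective (for example, under adverse weather)."""
    return state + efficiency * action


def sim_inverse_model(state: float, next_state: float) -> float:
    """Inverse of sim_step: the action that moves state to next_state in the simulator."""
    return next_state - state


def ground_action(state: float, action: float) -> float:
    """Return the action that, when run in the simulator, reproduces the
    next state that the forward model predicts for the real world."""
    predicted_next = real_forward_model(state, action)
    return sim_inverse_model(state, predicted_next)


if __name__ == "__main__":
    s, a = 2.0, 1.0
    grounded = ground_action(s, a)
    # The simulator, driven by the grounded action, now follows the modeled real dynamics.
    print(grounded, sim_step(s, grounded), real_forward_model(s, a))
```

In this toy setting, training in the simulator with `ground_action` applied to each policy action exposes the agent to transitions that match the modeled real-world dynamics rather than the idealized simulator dynamics.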
This content will become publicly available on March 31, 2026
The traffic signal controller (TSC) plays a crucial role in managing traffic flow in urban areas. Recently, reinforcement learning (RL) models have received great attention for TSC, with promising results. However, these RL-TSC models still need to be improved for real-world deployment because of their limited exploration of different performance metrics, such as fair traffic scheduling or air quality impact. In this work, we introduce a constrained multi-objective RL model that minimizes multiple constrained objectives while achieving a higher expected reward. Furthermore, our proposed RL strategy integrates the peak and average constraint models into the RL problem formulation with maximum entropy off-policy models. We applied this strategy to a single TSC and a network of TSCs. As part of this constrained RL-TSC formulation, we discuss fairness and air quality parameters as constraints for the closed-loop control system optimization model at TSCs called
- Award ID(s): 1934568
- PAR ID: 10555548
- Publisher / Repository: ACM
- Date Published:
- Journal Name: ACM Journal on Autonomous Transportation Systems
- Volume: 2
- Issue: 1
- ISSN: 2833-0528
- Page Range / eLocation ID: 1 to 19
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Due to repetitive trial-and-error interactions between agents and a fixed traffic environment during policy learning, existing Reinforcement Learning (RL)-based Traffic Signal Control (TSC) methods suffer greatly from long RL training times and poor adaptability of RL agents to other complex traffic environments. To address these problems, we propose a novel Adversarial Inverse Reinforcement Learning (AIRL)-based pre-training method named InitLight, which enables effective initial model generation for TSC agents. Unlike traditional RL-based TSC approaches that train a large number of agents simultaneously for a specific multi-intersection environment, InitLight pre-trains only a single initial model based on multiple single-intersection environments together with their expert trajectories. Since the reward function learned by InitLight can recover ground-truth TSC rewards for different intersections at optimality, the pre-trained agent can be deployed at intersections of any traffic environment as an initial model to accelerate subsequent global RL training. Comprehensive experimental results show that the initial model generated by InitLight not only significantly accelerates convergence with far fewer episodes, but also exhibits superior generalization ability to accommodate various kinds of complex traffic environments.
-
Urban air mobility (UAM) is an emerging air transportation mode to alleviate the ground traffic burden and achieve zero direct aviation emissions. Due to the potential economic scaling effects, the UAM traffic flow is expected to increase dramatically once implemented, and its market can be substantially large. To be prepared for the era of UAM, we study the fair and risk‐averse urban air mobility resource allocation model (FairUAM) under passenger demand and airspace capacity uncertainties for fair, safe, and efficient aircraft operations. FairUAM is a two‐stage model, where the first stage is the aircraft resource allocation, and the second stage is to fairly and efficiently assign the ground and airspace delays to each aircraft given the realization of random airspace capacities and passenger demand. We show that FairUAM is NP‐hard even when there is no delay assignment decision or no aircraft allocation decision. Thus, we recast FairUAM as a mixed‐integer linear program (MILP), explore model properties, and strengthen the model formulation by developing multiple families of valid inequalities. The stronger formulation allows us to develop a customized exact decomposition algorithm with both Benders and L‐shaped cuts, which significantly outperforms off‐the‐shelf solvers. Finally, we numerically demonstrate the effectiveness of the proposed method and draw managerial insights when applying FairUAM to a real‐world network.
-
In an urban corridor with a mixed traffic composition of connected and automated vehicles (CAVs) alongside human-driven vehicles (HDVs), vehicle operations are influenced by both individual driving behaviors and the presence of signalized intersections. Therefore, a coordinated control strategy that accommodates both factors is needed to enhance the overall quality of traffic flow. This study proposes a bi-level structure designed to decouple the joint effects of vehicular driving behaviors and the corridor signal offset settings. The objective of this structure is to optimize both the average travel time (ATT) and the average fuel consumption (AFC). At the lower level, three types of car-following models that account for driving modes are presented to describe the desired driving behaviors of HDVs and CAVs. Moreover, a trigonometric function method combined with a rolling horizon scheme is proposed to generate the eco-trajectory of CAVs in the mixed traffic flow. At the upper level, a multi-objective optimization model for corridor signal offsets is formulated to minimize ATT and AFC based on the lower-level simulation outputs. Additionally, a revised Non-Dominated Sorting Genetic Algorithm II (NSGA-II) is adopted to identify the set of Pareto-optimal solutions for corridor signal offsets under different CAV penetration rates (CAV PRs). Numerical experiments are conducted within a corridor that encompasses three signalized intersections. The performance of the proposed eco-driving strategy is validated against the intelligent driver model (IDM) and the green light optimal speed advisory (GLOSA) algorithm in single-vehicle simulation. Results show that the proposed strategy yields lower travel time and fuel consumption than both IDM and GLOSA. Subsequently, the effectiveness of the proposed coordinated control strategy is validated across various CAV PRs. Results indicate that the optimal AFC can be reduced by 4.1%–32.2% with CAV PRs varying from 0.2 to 1, and the optimal ATT can be reduced by at most 2.3%. Furthermore, sensitivity analysis is conducted to evaluate the impact of CAV PRs and V/C ratios on the optimal ATT and AFC.
-
Connected vehicle (CV) technology brings both opportunities and challenges to the traffic signal control (TSC) system. While safety and mobility performance could be greatly improved by adopting CV technologies, the connectivity between vehicles and transportation infrastructure may increase the risk of cyber threats. In the past few years, studies on the cybersecurity of TSC systems have been conducted. However, a systematic investigation that provides a comprehensive analysis framework is still lacking. In this study, we aim to fill this research gap by proposing a comprehensive analysis framework for the cybersecurity problem of TSC in the CV environment. After analyzing potential threats to the major components of the system and their corresponding impacts on safety and efficiency, we consider the data spoofing attack to be the most plausible and realistic attack approach. Based on this finding, different attack strategies and defense solutions are discussed. A case study is presented to show the impact of data spoofing attacks on a selected CV-based TSC system and the corresponding mitigation countermeasures. This case study is conducted on a hybrid security testing platform, with virtual traffic and a real V2X communication network. To the best of our knowledge, this is the first study to present a comprehensive analysis framework for the cybersecurity problem of CV-based TSC systems.