Numerous solutions are proposed for the Traffic Signal Control (TSC) tasks aiming to provide efficient transportation and alleviate traffic congestion. Recently, promising results have been attained by Reinforcement Learning (RL) methods through trial and error in simulators, bringing confidence in solving cities' congestion problems. However, performance gaps still exist when simulator-trained policies are deployed to the real world. This issue is mainly introduced by the system dynamic difference between the training simulators and the real-world environments. In this work, we leverage the knowledge of Large Language Models (LLMs) to understand and profile the system dynamics by a prompt-based grounded action transformation to bridge the performance gap. Specifically, this paper exploits the pre-trained LLM's inference ability to understand how traffic dynamics change with weather conditions, traffic states, and road types. Being aware of the changes, the policies' action is taken and grounded based on realistic dynamics, thus helping the agent learn a more realistic policy. We conduct experiments on four different scenarios to show the effectiveness of the proposed PromptGAT's ability to mitigate the performance gap of reinforcement learning from simulation to reality (sim-to-real). 
                        more » 
                        « less   
                    This content will become publicly available on March 31, 2026
                            
                            Constrained Reinforcement Learning for Fair and Environmentally Efficient Traffic Signal Controllers
                        
                    
    
            Traffic signal controller (TSC) has a crucial role in managing traffic flow in urban areas. Recently, reinforcement learning (RL) models have received a great attention for TSC with promising results. However, these RL-TSC models still need to be improved for real-world deployment due to limited exploration of different performance metrics such as fair traffic scheduling or air quality impact. In this work, we introduce a constrained multi-objective RL model that minimizes multiple constrained objectives while achieving a higher expected reward. Furthermore, our proposed RL strategy integrates the peak and average constraint models to the RL problem formulation with maximum entropy off-policy models. We applied this strategy to a single TSC and a network of TSCs. As part of this constrained RL-TSC formulation, we discuss fairness and air quality parameters as constraints for the closed-loop control system optimization model at TSCs calledFAirLight. Our experimental analysis shows that the proposedFAirLightachieves a good traffic flow performance in terms of average waiting time while being fair and environmentally friendly. Our method outperforms the baseline models and allows a more comprehensive view of RL-TSC regarding its applicability to the real world. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 1934568
- PAR ID:
- 10555548
- Publisher / Repository:
- ACM
- Date Published:
- Journal Name:
- ACM Journal on Autonomous Transportation Systems
- Volume:
- 2
- Issue:
- 1
- ISSN:
- 2833-0528
- Page Range / eLocation ID:
- 1 to 19
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            Due to repetitive trial-and-error style interactions between agents and a fixed traffic environment during the policy learning, existing Reinforcement Learning (RL)-based Traffic Signal Control (TSC) methods greatly suffer from long RL training time and poor adaptability of RL agents to other complex traffic environments. To address these problems, we propose a novel Adversarial Inverse Reinforcement Learning (AIRL)-based pre-training method named InitLight, which enables effective initial model generation for TSC agents. Unlike traditional RL-based TSC approaches that train a large number of agents simultaneously for a specific multi-intersection environment, InitLight pretrains only one single initial model based on multiple single-intersection environments together with their expert trajectories. Since the reward function learned by InitLight can recover ground-truth TSC rewards for different intersections at optimality, the pre-trained agent can be deployed at intersections of any traffic environments as initial models to accelerate subsequent overall global RL training. Comprehensive experimental results show that, the initial model generated by InitLight can not only significantly accelerate the convergence with much fewer episodes, but also own superior generalization ability to accommodate various kinds of complex traffic environments.more » « less
- 
            Abstract Urban air mobility (UAM) is an emerging air transportation mode to alleviate the ground traffic burden and achieve zero direct aviation emissions. Due to the potential economic scaling effects, the UAM traffic flow is expected to increase dramatically once implemented, and its market can be substantially large. To be prepared for the era of UAM, we study the fair and risk‐averse urban air mobility resource allocation model (FairUAM) under passenger demand and airspace capacity uncertainties for fair, safe, and efficient aircraft operations. FairUAM is a two‐stage model, where the first stage is the aircraft resource allocation, and the second stage is to fairly and efficiently assign the ground and airspace delays to each aircraft provided the realization of random airspace capacities and passenger demand. We show that FairUAM is NP‐hard even when there is no delay assignment decision or no aircraft allocation decision. Thus, we recast FairUAM as a mixed‐integer linear program (MILP) and explore model properties and strengthen the model formulation by developing multiple families of valid inequalities. The stronger formulation allows us to develop a customized exact decomposition algorithm with both benders and L‐shaped cuts, which significantly outperforms the off‐the‐shelf solvers. Finally, we numerically demonstrate the effectiveness of the proposed method and draw managerial insights when applying FairUAM to a real‐world network.more » « less
- 
            Eco-driving has garnered considerable research attention owing to its potential socio-economic impact, including enhanced public health and mitigated climate change effects through the reduction of greenhouse gas emissions. With an expectation of more autonomous vehicles (AVs) on the road, an eco-driving strategy in hybrid traffic networks encompassing AV and human-driven vehicles (HDVs) with the coordination of traffic lights is a challenging task. The challenge is partially due to the insufficient infrastructure for collecting, transmitting, and sharing real-time traffic data among vehicles, facilities, and traffic control centers, and the following decision-making of agents involved in traffic control. Additionally, the intricate nature of the existing traffic network, with its diverse array of vehicles and facilities, contributes to the challenge by hindering the development of a mathematical model for accurately characterizing the traffic network. In this study, we utilized the Simulation of Urban Mobility (SUMO) simulator to tackle the first challenge through computational analysis. To address the second challenge, we employed a model-free reinforcement learning (RL) algorithm, proximal policy optimization, to decide the actions of AV and traffic light signals in a traffic network. A novel eco-driving strategy was proposed by introducing different percentages of AV into the traffic flow and collaborating with traffic light signals using RL to control the overall speed of the vehicles, resulting in improved fuel consumption efficiency. Average rewards with different penetration rates of AV (5%, 10%, and 20% of total vehicles) were compared to the situation without any AV in the traffic flow (0% penetration rate). The 10% penetration rate of AV showed a minimum time of convergence to achieve average reward, leading to a significant reduction in fuel consumption and total delay of all vehicles.more » « less
- 
            Traffic congestion has long been an ubiquitous problem that is exacerbating with the rapid growth of megac- ities. In this proof-of-concept work we study intrinsic motiva- tion, implemented via the empowerment principle, to control autonomous car behavior to improve traffic flow. In standard models of traffic dynamics, self-organized traffic jams emerge spontaneously from the individual behavior of cars, affecting traffic over long distances. Our novel car behavior strategy improves traffic flow while still being decentralized and using only locally available information without explicit coordination. Decentralization is essential for various reasons, not least to be able to absorb robustly substantial levels of uncertainty. Our scenario is based on the well-established traffic dynamics model, the Nagel-Schreckenberg cellular automaton. In a fraction of the cars in this model, we substitute the default behavior by empowerment, our intrinsic motivation-based method. This proposed model significantly improves overall traffic flow, mitigates congestion, and reduces the average traffic jam time.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
