Title: SMART-eFlo: An Integrated SUMO-Gym Framework for Multi-Agent Reinforcement Learning in Electric Fleet Management Problem
Electric vehicles (EVs) have been adopted in ride-hailing systems in recent years, which makes the electric fleet management problem (EFMP) critical. This paper aims to leverage multi-agent reinforcement learning (MARL) for EFMP. In particular, we focus on how EVs learn to manage battery charging and to pick up and drop off passengers. We propose an integrated SUMO-Gym framework based on the SUMO simulator to capture EVs’ asynchronous decision-making regarding charging and ride-hailing in complex traffic environments. We adopt a hierarchical reinforcement learning (HRL) scheme, where each EV decides whether to charge or pick up a passenger on the upper level and chooses a charging station or passenger on the lower level. We develop a learning algorithm for the HRL scheme to solve EFMP and present numerical results on the efficiency of our algorithm and the policies EVs have learned in EFMP. Our code is available at https://github.com/LovelyBuggies/SUMO-Gym, which provides an open-source environment for researchers to design traffic scenarios and test RL algorithms for EFMP.
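The two-level decision structure described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Option` type, the battery threshold, and the hand-written scoring rules are hypothetical stand-ins for the learned upper- and lower-level policies, and none of it is taken from the SUMO-Gym code base.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Option:
    """Hypothetical candidate target: a charging station or a waiting passenger."""
    name: str
    distance_km: float
    reward: float  # expected fare for a passenger; 0.0 for a charging station


def upper_level(battery_frac: float, threshold: float = 0.3) -> str:
    """Stand-in for the upper-level policy: choose between charging and pickup."""
    return "charge" if battery_frac < threshold else "pickup"


def lower_level(goal: str, stations: List[Option], passengers: List[Option]) -> Option:
    """Stand-in for the lower-level policy: choose a concrete target for the goal."""
    if goal == "charge":
        # Assumed rule: drive to the nearest charging station.
        return min(stations, key=lambda o: o.distance_km)
    # Assumed rule: trade expected fare against travel distance.
    return max(passengers, key=lambda o: o.reward - 0.5 * o.distance_km)


def act(battery_frac: float, stations: List[Option],
        passengers: List[Option]) -> Tuple[str, Option]:
    """One hierarchical decision: the goal first, then the target for that goal."""
    goal = upper_level(battery_frac)
    return goal, lower_level(goal, stations, passengers)


stations = [Option("s1", 2.0, 0.0), Option("s2", 1.0, 0.0)]
passengers = [Option("p1", 1.0, 5.0), Option("p2", 4.0, 6.0)]
low_goal, low_target = act(0.2, stations, passengers)
high_goal, high_target = act(0.8, stations, passengers)
```

In a learned version, both functions would be replaced by trained policies; the sketch only shows how the lower-level choice is conditioned on the upper-level goal.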
Award ID(s):
2038984
NSF-PAR ID:
10447569
Journal Name:
2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Advances in information technologies and vehicle automation have birthed new transportation services, including shared autonomous vehicles (SAVs). Shared autonomous vehicles are on-demand self-driving taxis, with flexible routes and schedules, able to replace personal vehicles for many trips in the near future. The siting and density of pick-up and drop-off (PUDO) points for SAVs, much like bus stops, can be key in planning SAV fleet operations, since PUDOs impact SAV demand, route choices, passenger wait times, and network congestion. Unlike traditional human-driven taxis and ride-hailing vehicles like Lyft and Uber, SAVs are unlikely to engage in quasi-legal procedures, like double parking or fire hydrant pick-ups. In congested settings, like central business districts (CBDs) or airport curbs, SAVs and others will not be allowed to pick up and drop off passengers wherever they like. This paper uses an agent-based simulation to model the impact of different PUDO locations and densities in the Austin, Texas CBD, where land values are highest and curb spaces are coveted. In this paper, 18 scenarios were tested, varying PUDO density, fleet size, and fare price. The results show that for a given fare price and fleet size, PUDO spacing (e.g., one block vs. three blocks) has a significant impact on ridership, vehicle-miles travelled, vehicle occupancy, and revenue. A good fleet size to serve the region’s 80 core square miles is 4000 SAVs, charging a $1 fare per mile of travel distance, with PUDOs spaced three blocks apart in the CBD.

     
  2. Curb space is one of the busiest areas in urban road networks. Especially in recent years, the rapid increase of ride-hailing trips and commercial deliveries has induced massive pick-ups/drop-offs (PUDOs), which occupy the limited curb space that was designed and built decades ago. These PUDOs could jam curbside utilization and disturb the mainline traffic flow, leading to significant negative societal externalities. However, there is a lack of an analytical framework that rigorously quantifies and mitigates the congestion effect of PUDOs from a system-wide view, particularly with little data support and in the presence of confounding effects. To bridge this research gap, this paper develops a rigorous causal inference approach to estimate the congestion effect of PUDOs on general regional networks. A causal graph is constructed to represent the spatiotemporal relationship between PUDOs and traffic speed, and a double and separated machine learning (DSML) method is proposed to quantify how PUDOs affect traffic congestion. Additionally, a rerouting formulation is developed and solved to encourage passenger walking and traffic flow rerouting to achieve system optimization. Numerical experiments are conducted using real-world data in the Manhattan area. On average, 100 additional units of PUDOs in a region could reduce the traffic speed by 3.70 and 4.54 miles/hour (mph) on weekdays and weekends, respectively. Rerouting trips with PUDOs on curb space could respectively reduce the system-wide total travel time (TTT) by 2.44% and 2.12% in Midtown and Central Park on weekdays. A sensitivity analysis is also conducted to demonstrate the effectiveness and robustness of the proposed framework.

    Funding: The work described in this paper was supported by the National Natural Science Foundation of China [Grant 52102385], grants from the Research Grants Council of the Hong Kong Special Administrative Region, China [Grants PolyU/25209221 and PolyU/15206322], a grant from the Otto Poon Charitable Foundation Smart Cities Research Institute (SCRI) at the Hong Kong Polytechnic University [Grant P0043552], and a grant from Hong Kong Polytechnic University [Grant P0033933]. S. Qian was supported by a National Science Foundation Grant [Grant CMMI-1931827].

    Supplemental Material: The e-companion is available at https://doi.org/10.1287/trsc.2022.0195 .

     
  3. Knowledge of all occupied and unoccupied trips made by self-employed drivers is essential for optimized vehicle dispatch by ride-hailing services (e.g., Didi Dache, Uber, Lyft, Grab, etc.). However, vehicles’ occupancy status is not always known to service operators because drivers use multiple ride-hailing apps. In this paper, we propose a novel framework, Learning to INfer Trips (LINT), to infer the occupancy of car trips by exploring characteristics of observed occupied trips. LINT includes two main research steps: stop point classification and structural segmentation. In the first step, we represent a vehicle trajectory as a sequence of stop points and assign each stop point a pick-up, drop-off, or intermediate label, thus producing a stop point label sequence. In the second step, structural segmentation, we further propose several segmentation algorithms, including greedy segmentation (GS), efficient greedy segmentation (EGS), and dynamic programming-based segmentation (DP), to infer occupied trips from stop point label sequences. Our comprehensive experiments on real vehicle trajectories from self-employed drivers show that (1) the proposed stop point classifier predicts stop point labels with high accuracy, and (2) the proposed segmentation algorithm GS delivers the best accuracy with efficient running time.
  4. This paper considers off-street parking for the cruising vehicles of transportation network companies (TNCs) to reduce traffic congestion. We propose a novel business model that integrates the shared parking service into the TNC platform. In the proposed model, the platform (a) provides interfaces that connect passengers, drivers, and garage operators (commercial or private garages); (b) determines the ride fare, driver payment, and parking rates; (c) matches passengers to TNC vehicles for ride-hailing services; and (d) matches vacant TNC vehicles to unoccupied parking garages to reduce the cruising cost. A queuing-theoretic model is proposed to capture the matching process of passengers, drivers, and parking garages. A market-equilibrium model is developed to capture the incentives of the passengers, drivers, and garage operators. An optimization-based model is formulated to capture the optimal pricing of the TNC platform. Through a realistic case study, we show that the proposed business model will offer a Pareto improvement that benefits all stakeholders, leading to higher passenger surplus, higher driver surplus, higher garage operator surplus, higher platform profit, and reduced traffic congestion.
  5. Urban public transit planning is crucial in reducing traffic congestion and enabling green transportation. However, there is no systematic way to integrate passengers' personal preferences in planning public transit routes and schedules so as to achieve high occupancy rates and efficiency gains from ride-sharing. In this paper, we take the first step to extract passengers' preferences in planning from historical public transit data. We propose a data-driven method to construct a Markov decision process model that characterizes the process of passengers making sequential public transit choices among bus routes, subway lines, and transfer stops/stations. Using the model, we integrate softmax policy iteration into maximum entropy inverse reinforcement learning to infer the passenger's reward function from observed trajectory data. The inferred reward function will enable an urban planner to predict passengers' route planning decisions given some proposed transit plans, for example, opening a new bus route or subway line. Finally, we demonstrate the correctness and accuracy of our modeling and inference methods on large-scale (three months) passenger-level public transit trajectory data from Shenzhen, China. Our method contributes to smart transportation design and human-centric urban planning.
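Item 5 above relies on a softmax policy inside maximum entropy inverse reinforcement learning. As a rough illustration of that one ingredient, the sketch below computes a softmax policy for a fixed reward by soft value iteration on a toy deterministic MDP. It covers only the forward step, not the reward inference, and it is a generic textbook-style sketch rather than the cited authors' algorithm; the toy rewards, transitions, discount factor, and function names are all assumptions.

```python
import math
from typing import List, Tuple


def logsumexp(xs: List[float]) -> float:
    """Numerically stable log(sum(exp(x))) over a list of floats."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def soft_value_iteration(
    rewards: List[List[float]],      # rewards[s][a]: assumed known reward
    transitions: List[List[int]],    # transitions[s][a]: deterministic next state
    gamma: float = 0.9,
    iters: int = 200,
) -> Tuple[List[float], List[List[float]]]:
    """Return soft state values and the softmax policy they induce."""
    n = len(rewards)
    v = [0.0] * n
    q: List[List[float]] = [[0.0] * len(row) for row in rewards]
    for _ in range(iters):
        # Soft Bellman backup: Q(s,a) = r(s,a) + gamma * V(s'), V(s) = logsumexp_a Q(s,a).
        q = [[rewards[s][a] + gamma * v[transitions[s][a]]
              for a in range(len(rewards[s]))] for s in range(n)]
        v = [logsumexp(row) for row in q]
    # Softmax policy: pi(a|s) = exp(Q(s,a) - V(s)).
    policy = [[math.exp(qa - v[s]) for qa in q[s]] for s in range(n)]
    return v, policy


# Toy problem (hypothetical): two states, two actions; action 0 leads to state 0,
# action 1 leads to state 1, and only (state 0, action 0) is rewarded.
toy_rewards = [[1.0, 0.0], [0.0, 0.0]]
toy_transitions = [[0, 1], [0, 1]]
values, policy = soft_value_iteration(toy_rewards, toy_transitions)
```

In an inverse reinforcement learning loop, a step like this would be repeated each time the reward estimate is updated, so that the policy implied by the current reward can be compared with the observed trajectories.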