This content will become publicly available on August 3, 2026

Title: Multi-Agent Reinforcement Learning for Dynamic Mobility Resource Allocation with Hierarchical Adaptive Grouping
Allocating mobility resources (e.g., shared bikes/e-scooters, ridesharing vehicles) is crucial for rebalancing mobility demand and supply in urban environments. In this work, we propose a novel multi-agent reinforcement learning approach named Hierarchical Adaptive Grouping-based Parameter Sharing (HAG-PS) for dynamic mobility resource allocation. HAG-PS addresses two important research challenges regarding multi-agent reinforcement learning for mobility resource allocation: (1) how to dynamically and adaptively share the mobility resource allocation policy (i.e., how to distribute mobility resources) across agents (i.e., the regional coordinators of mobility resources); and (2) how to achieve memory-efficient parameter sharing in an urban-scale setting. To address these challenges, HAG-PS introduces the following novel designs. To enable dynamic and adaptive parameter sharing, we have designed a hierarchical approach that combines global and local information about the mobility resource states (e.g., distribution of mobility resources). We have developed an adaptive agent grouping approach that splits or merges groups of agents based on the relative closeness of their encoded trajectories (i.e., states, actions, and rewards). We have designed learnable identity (ID) embeddings to enable agent specialization beyond simple parameter copying. We have performed extensive experimental studies based on real-world NYC bike sharing data (more than 1.2 million trips in total) and demonstrated the superior performance (e.g., improved bike availability) of HAG-PS compared with baseline approaches.
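The split-or-merge idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the encoder, the distance measure, or the grouping rule, so the Euclidean distance, the centroid comparison, the thresholds `merge_below` and `split_above`, and the function name `regroup` are all assumptions made for illustration. Each agent is represented by a hypothetical fixed-length encoding of its trajectory.

```python
import math
from itertools import combinations


def distance(a, b):
    """Euclidean distance between two trajectory encodings (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def group_center(group, encodings):
    """Centroid of the encodings of all agents in one group."""
    return centroid([encodings[agent] for agent in group])


def regroup(encodings, groups, merge_below, split_above):
    """One hypothetical regrouping pass: split far members, then merge close groups."""
    # Split step: an agent farther than split_above from its group's centroid
    # leaves the group and becomes a singleton group.
    split = []
    for group in groups:
        center = group_center(group, encodings)
        keep = {a for a in group if distance(encodings[a], center) <= split_above}
        split.extend({a} for a in sorted(group - keep))
        if keep:
            split.append(keep)

    # Merge step: repeatedly merge the two groups with the closest centroids,
    # stopping once the closest pair is farther apart than merge_below.
    merged = [set(g) for g in split]
    while len(merged) > 1:
        i, j = min(
            combinations(range(len(merged)), 2),
            key=lambda p: distance(group_center(merged[p[0]], encodings),
                                   group_center(merged[p[1]], encodings)),
        )
        gap = distance(group_center(merged[i], encodings),
                       group_center(merged[j], encodings))
        if gap > merge_below:
            break
        merged[i] |= merged[j]
        del merged[j]  # j > i, so index i stays valid
    return merged


# Toy encodings (made up): agents 0 and 1 behave alike, as do agents 2 and 3.
encodings = {0: [0.0, 0.0], 1: [0.1, 0.0], 2: [5.0, 5.0], 3: [5.1, 5.0]}
groups = regroup(encodings, [{0, 1, 2}, {3}], merge_below=1.0, split_above=3.0)
print(sorted(sorted(g) for g in groups))  # [[0, 1], [2, 3]]
```

In this toy run, agent 2 is split away from agents 0 and 1 and then merged with agent 3. Under a parameter-sharing scheme, agents in the same group would share one policy network.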
Award ID(s):
2303575
PAR ID:
10620964
Author(s) / Creator(s):
;
Publisher / Repository:
The 14th International Workshop on Urban Computing
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  2. With the popularity of the Internet, traditional offline resource allocation has evolved into a new form, called online resource allocation. It features the online arrivals of agents in the system and the real-time decision-making requirement upon the arrival of each online agent. Both offline and online resource allocation have wide applications in various real-world matching markets ranging from ridesharing to crowdsourcing. There are some emerging applications such as rebalancing in bike sharing and trip-vehicle dispatching in ridesharing, which involve a two-stage resource allocation process. The process consists of an offline phase and another sequential online phase, and both phases compete for the same set of resources. In this paper, we propose a unified model which incorporates both offline and online resource allocation into a single framework. Our model assumes non-uniform and known arrival distributions for online agents in the second online phase, which can be learned from historical data. We propose a parameterized linear programming (LP)-based algorithm, which is shown to be at most a constant factor of 1/4 from the optimal. Experimental results on the real dataset show that our LP-based approaches outperform the LP-agnostic heuristics in terms of robustness and effectiveness.
  3. Understanding the spatial dynamics of bike-sharing usage is critical for effective urban planning and mobility resource management. In this study, we propose an interpretable deep learning approach to uncover spatial relationships embedded in bike-sharing activities. Specifically, we develop a spatially-adapted SHapley Additive exPlanations (SHAP)-based method to quantify the spatial dependencies between locations in bike-sharing activities and apply it to interpret the predictions of a bike-sharing model. Extensive experiments upon Citi Bike data from New York City in December 2023 reveal that spatial influence does not strictly follow geographic proximity and is anisotropic. Additionally, non-member users exhibit weaker spatial dependencies in their bike usage behavior, resulting in lower short-term predictability compared to member users. Our studies shed deep insights into the spatial dynamics of bike-sharing systems and provide guidance for more effective service deployment and system design. 
  4. Bike sharing systems have been in place for several years in many urban areas as alternative and sustainable means of transportation. Bicycle usage heavily depends on the available infrastructure (e.g., protected bike lanes), but other environmental characteristics of a city, mutable or immutable, can influence the adoption of the system by its dwellers. Hence, it is important to understand how these factors influence people’s decisions of whether to use a bike system or not. In this paper, we first investigate how altitude variation influences the usage of the bike sharing system in Pittsburgh. Using trip data from the system, and controlling for a number of other potential confounding factors, we formulate the problem as a classification problem, develop a framework to enable prediction using Poisson regression, and find that there is a negative correlation between the altitude difference and the number of trips between two stations (fewer trips between stations with larger altitude difference). We further discuss how the results of our analysis can be used to inform decision making during the design and operation of bike sharing systems.
  5. How can non-communicating agents learn to share congested resources efficiently? This is a challenging task when the agents can access the same resource simultaneously (in contrast to multi-agent multi-armed bandit problems) and the resource valuations differ among agents. We present a fully distributed algorithm for learning to share in congested environments and prove that the agents’ regret with respect to the optimal allocation is poly-logarithmic in the time horizon. Performance in the non-asymptotic regime is illustrated in numerical simulations. The distributed algorithm has applications in cloud computing and spectrum sharing.