Data selection methods, such as active learning and core-set selection, are useful tools for machine learning on large datasets. However, they can be prohibitively expensive to apply in deep learning because they depend on feature representations that need to be learned. In this work, we show that we can greatly improve the computational efficiency by using a small proxy model to perform data selection (e.g., selecting data points to label for active learning). By removing hidden layers from the target model, using smaller architectures, and training for fewer epochs, we create proxies that are an order of magnitude faster to train. Although these small proxy models have higher error rates, we find that they empirically provide useful signals for data selection. We evaluate this “selection via proxy” (SVP) approach on several data selection tasks across five datasets: CIFAR10, CIFAR100, ImageNet, Amazon Review Polarity, and Amazon Review Full. For active learning, applying SVP can give an order of magnitude improvement in data selection runtime (i.e., the time it takes to repeatedly train and select points) without significantly increasing the final error (often within 0.1%). For core-set selection on CIFAR10, proxies that are over 10× faster to train than their larger, more accurate targets can remove up to 50% of the data without harming the final accuracy of the target, leading to a 1.6× end-to-end training time improvement.
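As a rough illustration of the SVP recipe, the sketch below runs one round of proxy-based selection on synthetic data: a small MLP proxy is trained cheaply, ranks the unlabeled pool by predictive entropy, and only the selected points are used to train a larger target model. The data, model sizes, and labeling budget are illustrative stand-ins, not the paper's CIFAR/ImageNet setup.

```python
# Hypothetical sketch of one SVP round: a cheap proxy ranks the unlabeled pool,
# and only the selected points are used to train the larger target model.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_mlp(in_dim, hidden, out_dim):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))

def train(model, x, y, epochs, lr=1e-2):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()

# Toy data standing in for a labeled seed set and a large unlabeled pool.
torch.manual_seed(0)
x_seed, y_seed = torch.randn(200, 32), torch.randint(0, 10, (200,))
x_pool, y_pool = torch.randn(5000, 32), torch.randint(0, 10, (5000,))  # pool labels hidden until selected

# 1) Train the cheap proxy (smaller architecture, fewer epochs).
proxy = make_mlp(32, 16, 10)
train(proxy, x_seed, y_seed, epochs=5)

# 2) Rank the pool by proxy uncertainty (predictive entropy) and take the top-k.
with torch.no_grad():
    probs = F.softmax(proxy(x_pool), dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
selected = entropy.topk(500).indices  # labeling budget of 500 points

# 3) "Label" the selected points and train the larger, more accurate target on them.
target = make_mlp(32, 256, 10)
train(target, torch.cat([x_seed, x_pool[selected]]), torch.cat([y_seed, y_pool[selected]]), epochs=40)
```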
Reinforcement Learning from Optimization Proxy for Ride-Hailing Vehicle Relocation
Idle vehicle relocation is crucial for addressing the demand-supply imbalance that frequently arises in ride-hailing systems. The current mainstream methodologies, optimization and reinforcement learning, suffer from clear computational drawbacks. Optimization models must be solved in real time and often trade off model fidelity (and hence solution quality) for computational efficiency. Reinforcement learning is expensive to train and often struggles to achieve coordination among a large fleet. This paper designs a hybrid approach that leverages the strengths of the two while overcoming their drawbacks. Specifically, it trains an optimization proxy, i.e., a machine-learning model that approximates an optimization model, and then refines the proxy with reinforcement learning. This Reinforcement Learning from Optimization Proxy (RLOP) approach is computationally efficient to train and deploy, and achieves better results than reinforcement learning or optimization alone. Numerical experiments on the New York City dataset show that RLOP significantly reduces both relocation costs and computation time compared to the optimization model, while pure reinforcement learning fails to converge due to computational complexity.
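The two-stage idea can be sketched roughly as follows: a neural proxy is first fit by imitating the decisions of an optimization model, then refined with a policy-gradient update on a task reward. The toy zone features, the `oracle` stand-in for the optimization model, and the reward below are hypothetical placeholders, not the paper's ride-hailing formulation.

```python
# Schematic two-stage RLOP-style training: imitate an optimization oracle, then
# refine with reinforcement learning. All quantities here are toy stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_zones = 8
# Policy maps per-zone (demand, supply) features to a relocation decision.
policy = nn.Sequential(nn.Linear(n_zones * 2, 64), nn.ReLU(), nn.Linear(64, n_zones))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def oracle(state):
    # Stand-in for the optimization model: relocate toward the zone with the
    # largest demand-supply gap.
    return (state[:, :n_zones] - state[:, n_zones:]).argmax(dim=1)

# Stage 1: train the proxy by imitating the optimization model's decisions.
for _ in range(500):
    state = torch.rand(64, n_zones * 2)
    loss = F.cross_entropy(policy(state), oracle(state))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: refine the proxy with a REINFORCE-style policy gradient on a toy
# reward (demand minus supply in the chosen zone); no oracle is needed here.
for _ in range(500):
    state = torch.rand(64, n_zones * 2)
    dist = torch.distributions.Categorical(logits=policy(state))
    action = dist.sample()
    idx = torch.arange(64)
    reward = state[idx, action] - state[idx, n_zones + action]
    loss = -(dist.log_prob(action) * (reward - reward.mean())).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```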
- Award ID(s):
- 1854684
- PAR ID:
- 10416446
- Date Published:
- Journal Name:
- Journal of Artificial Intelligence Research
- Volume:
- 75
- ISSN:
- 1076-9757
- Page Range / eLocation ID:
- 985 to 1002
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Living organisms learn on multiple time scales: evolutionary as well as individual-lifetime learning. These two learning modes are complementary: the innate phenotypes developed through evolution significantly influence lifetime learning. However, it is still unclear how these two learning modes interact, and whether there is a benefit to optimizing part of the system on an evolutionary time scale with a population-based approach while the rest of it is trained within the lifetime using an individual learning algorithm. In this work, we study the benefits of such a hybrid approach using an actor-critic framework in which the critic part of an agent is optimized over evolutionary time based on its ability to train the actor part of the agent during its lifetime. Typically, critics are optimized on the same time scale as the actor, using the Bellman equation to represent long-term expected reward. We show that evolution can find a variety of different solutions that still enable an actor to learn to perform a behavior during its lifetime. We also show that although the solutions found by evolution represent different functions, they all provide similar training signals during the lifetime. This suggests that learning on multiple time scales can effectively simplify the overall optimization process in the actor-critic framework by finding one of many solutions that can train an actor just as well. Furthermore, analysis of the evolved critics can yield additional possibilities for reinforcement learning beyond the Bellman equation.
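A toy sketch of the two-time-scale setup, under heavy simplifying assumptions (a one-dimensional task, a tiny critic network, and a basic (1+λ) evolution strategy standing in for the paper's evolutionary algorithm): the critic is mutated and selected across generations solely according to how well the actor it trains performs at the end of its lifetime.

```python
# Outer loop: evolve critic weights. Inner loop: the actor learns during its
# "lifetime" using only the critic's output as a training signal.
import copy
import torch
import torch.nn as nn

def true_reward(state, action):
    # Ground-truth reward used only to score fitness: act close to 2 * state.
    return -(action - 2.0 * state) ** 2

def lifetime_train(critic, steps=100):
    # Inner (lifetime) loop: the actor never sees the true reward directly.
    actor = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))
    opt = torch.optim.Adam(actor.parameters(), lr=1e-2)
    for _ in range(steps):
        state = torch.rand(32, 1)
        loss = -critic(torch.cat([state, actor(state)], dim=1)).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        state = torch.rand(256, 1)
        return true_reward(state, actor(state)).mean().item()  # fitness of this critic

# Outer (evolutionary) loop: a simple (1 + lambda) evolution strategy on the
# critic's weights, selected purely by the trained actor's performance.
parent = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 1))
parent_fit = lifetime_train(parent)
for generation in range(20):
    candidates = []
    for _ in range(8):
        child = copy.deepcopy(parent)
        with torch.no_grad():
            for p in child.parameters():
                p.add_(0.1 * torch.randn_like(p))  # Gaussian mutation
        candidates.append((lifetime_train(child), child))
    best_fit, best_child = max(candidates, key=lambda c: c[0])
    if best_fit > parent_fit:
        parent, parent_fit = best_child, best_fit
```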
-
From higher computational efficiency to enabling the discovery of novel and complex structures, deep learning has emerged as a powerful framework for the design and optimization of nanophotonic circuits and components. However, both data-driven and exploration-based machine learning strategies have limitations in their effectiveness for nanophotonic inverse design. Supervised machine learning approaches require large quantities of training data to produce high-performance models and have difficulty generalizing beyond the training data given the complexity of the design space. Unsupervised and reinforcement-learning-based approaches, on the other hand, can have very lengthy training or optimization times. Here we demonstrate a hybrid supervised learning and reinforcement learning approach to the inverse design of nanophotonic structures and show that this approach can reduce training data dependence, improve the generalizability of model predictions, and significantly shorten exploratory training times. The presented strategy thus addresses several contemporary deep-learning-based challenges while opening the door for new design methodologies that leverage multiple classes of machine learning algorithms to produce more effective and practical solutions for photonic design.
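A minimal sketch of the hybrid recipe, assuming a hypothetical `simulate` figure of merit and a toy 4-parameter structure in place of a real electromagnetic solver: the design network is warm-started on a small supervised dataset and then refined with policy-gradient updates driven only by simulator feedback.

```python
# Hybrid supervised + reinforcement learning sketch for inverse design.
import torch
import torch.nn as nn

def simulate(design):
    # Hypothetical stand-in for an electromagnetic simulator's figure of merit.
    return -(design - 0.3).pow(2).sum(dim=1)

designer = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))

# Stage 1: supervised warm start on a small set of (target spectrum, known design) pairs.
spec, known_design = torch.randn(256, 8), torch.rand(256, 4)
opt = torch.optim.Adam(designer.parameters(), lr=1e-3)
for _ in range(300):
    loss = (designer(spec) - known_design).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: policy-gradient refinement using only simulator feedback, which lets
# the model improve beyond the limited supervised dataset.
log_std = torch.zeros(4, requires_grad=True)
opt = torch.optim.Adam(list(designer.parameters()) + [log_std], lr=1e-4)
for _ in range(300):
    spec = torch.randn(64, 8)
    dist = torch.distributions.Normal(designer(spec), log_std.exp())
    design = dist.sample()
    reward = simulate(design)
    loss = -(dist.log_prob(design).sum(dim=1) * (reward - reward.mean())).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```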
-
Tasks across diverse application domains, including graphics, vision, machine learning, imaging, health, scheduling, planning, and energy system forecasting, can be posed as large-scale optimization problems. Independently of the application domain, proximal algorithms have emerged as a formal optimization method that successfully solves a wide array of existing problems, often exploiting problem-specific structures in the optimization. Although model-based formal optimization provides a principled approach to problem modeling with convergence guarantees, it seems, at first glance, to be at odds with black-box deep learning methods. A recent line of work shows that, when combined with learning-based ingredients, model-based optimization methods are effective, interpretable, and allow for generalization to a wide spectrum of applications with little or no extra training data. However, experimenting with such hybrid approaches for different tasks by hand requires domain expertise in both proximal optimization and deep learning, which is often error-prone and time-consuming. Moreover, naively unrolling these iterative methods produces lengthy compute graphs, which, when differentiated via autograd techniques, result in exploding memory consumption, making batch-based training challenging. In this work, we introduce ∇-Prox, a domain-specific modeling language and compiler for large-scale optimization problems using differentiable proximal algorithms. ∇-Prox allows users to specify optimization objective functions of unknowns concisely at a high level, and intelligently compiles the problem into compute- and memory-efficient differentiable solvers. One of the core features of ∇-Prox is its full differentiability, which supports hybrid model- and learning-based solvers integrating proximal optimization with neural network pipelines. Example applications of this methodology include learning-based priors and/or sample-dependent inner-loop optimization schedulers, learned with deep equilibrium learning or deep reinforcement learning. With a few lines of code, we show that ∇-Prox can generate performant solvers for a range of image optimization problems, including end-to-end computational optics, image deraining, and compressive magnetic resonance imaging. We also demonstrate that ∇-Prox can be used in the completely orthogonal application domain of energy system planning, an essential task in the energy crisis and the clean energy transition, where it outperforms state-of-the-art CVXPY and commercial Gurobi solvers.
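The following is a generic illustration of the underlying building block, not the ∇-Prox API: an unrolled proximal-gradient solver for min_x ½‖Ax − b‖² + g(x) in which the proximal operator of g is replaced by a small learned network, so the whole K-step solver is differentiable and can be trained end to end.

```python
# Unrolled, fully differentiable proximal-gradient solver with a learned prox.
# This is a generic sketch of differentiable proximal optimization, not ∇-Prox.
import torch
import torch.nn as nn

class UnrolledProxGrad(nn.Module):
    def __init__(self, dim, steps=10, step_size=0.1):
        super().__init__()
        self.steps, self.step_size = steps, step_size
        # Learned stand-in for prox_g, shared across all iterations.
        self.prox = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, A, b):
        x = torch.zeros(b.shape[0], A.shape[1])
        for _ in range(self.steps):
            grad = (x @ A.T - b) @ A                  # gradient of 0.5 * ||Ax - b||^2
            x = self.prox(x - self.step_size * grad)  # proximal (here: learned) step
        return x

# Train the learned prox so the unrolled solver recovers sparse ground-truth signals.
torch.manual_seed(0)
A = torch.randn(32, 64) / 8.0
solver = UnrolledProxGrad(dim=64)
opt = torch.optim.Adam(solver.parameters(), lr=1e-3)
for _ in range(500):
    x_true = torch.randn(16, 64) * (torch.rand(16, 64) < 0.1)  # sparse signals
    b = x_true @ A.T
    loss = (solver(A, b) - x_true).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```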
-
Visualizing optimization landscapes has resulted in many fundamental insights in numeric optimization, specifically regarding novel improvements to optimization techniques. However, visualizations of the objective that reinforcement learning optimizes (the "reward surface") have only ever been generated for a small number of narrow contexts. This work presents reward surfaces and related visualizations of 27 of the most widely used reinforcement learning environments in Gym for the first time. We also explore reward surfaces in the policy gradient direction and show for the first time that many popular reinforcement learning environments have frequent "cliffs" (sudden large drops in expected reward). We demonstrate that A2C often "dives off" these cliffs into low-reward regions of the parameter space, while PPO avoids them, confirming a popular intuition for PPO's improved performance over previous methods. We additionally introduce a highly extensible library that allows researchers to easily generate these visualizations in the future. Our findings provide new intuition to explain the successes and failures of modern RL methods, and our visualizations concretely characterize several failure modes of reinforcement learning agents in novel ways.
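A minimal sketch of how such a reward surface can be sampled, assuming the gymnasium package and a toy linear CartPole policy (the paper's library is far more general): perturb the policy parameters along two random unit directions and record the mean episode return on a grid.

```python
# Sample a 2-D "reward surface" around a fixed policy by evaluating mean episode
# return at grid points theta0 + a*d1 + b*d2. Assumes gymnasium is installed.
import numpy as np
import gymnasium as gym

env = gym.make("CartPole-v1")
theta0 = np.random.randn(4, 2) * 0.1          # linear policy: obs (4) -> action logits (2)
d1, d2 = np.random.randn(4, 2), np.random.randn(4, 2)
d1, d2 = d1 / np.linalg.norm(d1), d2 / np.linalg.norm(d2)  # two random unit directions

def mean_return(theta, episodes=5):
    total = 0.0
    for _ in range(episodes):
        obs, _ = env.reset()
        done = False
        while not done:
            action = int(np.argmax(obs @ theta))  # greedy action from the linear policy
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
    return total / episodes

# Evaluate the surface on a small grid of perturbations around theta0.
alphas = np.linspace(-1.0, 1.0, 11)
surface = np.array([[mean_return(theta0 + a * d1 + b * d2) for b in alphas] for a in alphas])
print(surface.shape)  # (11, 11) grid of mean returns, ready to plot as a heatmap
```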