Multi-objective or multi-destination path planning is crucial for mobile robotics applications such as mobility as a service, robotic inspection, and electric vehicle charging for long trips. This work proposes an anytime iterative system to concurrently solve the multi-objective path planning problem and determine the visiting order of destinations. The system comprises an anytime informable multi-objective and multi-directional RRT* algorithm to form a simple connected graph, and a solver that combines an enhanced cheapest insertion algorithm with a genetic algorithm to approximately solve the relaxed traveling salesman problem in polynomial time. Moreover, a list of waypoints is often provided for robotic inspection and vehicle routing so that the robot can preferentially visit certain equipment or areas of interest. We show that the proposed system can inherently incorporate such knowledge to navigate challenging topology. The proposed anytime system is evaluated on large and complex graphs built for real-world driving applications. C++ implementations are available at: https://github.com/UMich-BipedLab/IMOMD-RRTStar.
Effect of Routing Constraints on Learning Efficiency of Destination Recommender Systems in Mobility-on-Demand Services
With Mobility-as-a-Service platforms moving toward vertical service expansion, we propose a destination recommender system for Mobility-on-Demand (MOD) services that explicitly considers dynamic vehicle routing constraints as a form of a "physical internet search engine". It incorporates a routing algorithm to build vehicle routes and an upper-confidence-bound-based algorithm for a generalized linear contextual bandit to identify alternatives that are acceptable to passengers. Because the routing subproblem adds context to the bandit, it is unclear how effective learning is under such circumstances. We propose a new simulation experimental framework to evaluate the impact of adding the routing constraints to the destination recommender algorithm. The proposed algorithm is first tested on a 7 by 7 grid network and performs better than benchmarks that include random alternatives, selecting the highest rating, or selecting the destination with the smallest vehicle routing cost increase. The RecoMOD algorithm also reduces average increases in vehicle travel costs compared to using random or highest-rating recommendations. Its application to a Manhattan dataset with ratings for 1,012 destinations reveals that a higher customer arrival rate and faster vehicle speeds lead to better acceptance rates. While these two results sound contradictory, they provide important managerial insights for MOD operators.
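To make the upper-confidence-bound idea in the abstract concrete, the following is a minimal sketch of a linear UCB contextual bandit, not the RecoMOD algorithm itself. It assumes a plain linear reward model rather than the generalized linear model the abstract names, and the class name `LinUCB`, the parameter `alpha`, and the idea of encoding a routing cost increase as one entry of the context vector are hypothetical choices made for illustration.

```python
import math


class LinUCB:
    """Illustrative linear UCB bandit (a simplification, not the paper's method)."""

    def __init__(self, dim, alpha=1.0):
        self.dim = dim
        self.alpha = alpha  # exploration weight (assumed hyperparameter)
        # Inverse of the design matrix A; A starts as the identity.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * context

    def _matvec(self, M, v):
        return [sum(M[i][j] * v[j] for j in range(self.dim))
                for i in range(self.dim)]

    def score(self, x):
        """Estimated reward plus a confidence width for context x."""
        theta = self._matvec(self.A_inv, self.b)
        mean = sum(t * xi for t, xi in zip(theta, x))
        Ax = self._matvec(self.A_inv, x)
        width = math.sqrt(max(sum(xi * a for xi, a in zip(x, Ax)), 0.0))
        return mean + self.alpha * width

    def select(self, contexts):
        """Index of the alternative with the highest upper confidence bound."""
        return max(range(len(contexts)), key=lambda i: self.score(contexts[i]))

    def update(self, x, reward):
        """Rank-one update of A_inv via the Sherman-Morrison formula."""
        Ax = self._matvec(self.A_inv, x)
        denom = 1.0 + sum(xi * a for xi, a in zip(x, Ax))
        for i in range(self.dim):
            for j in range(self.dim):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]
```

In a setting like the one described, each candidate destination would supply one context vector per request (for example, a rating feature and a routing cost feature, both assumptions here), `select` would pick the alternative to recommend, and `update` would be called with 1 or 0 depending on whether the passenger accepted.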
- Award ID(s): 1652735
- PAR ID: 10213710
- Date Published:
- Journal Name: IEEE Transactions on Intelligent Transportation Systems
- ISSN: 1524-9050
- Page Range / eLocation ID: 1 to 16
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Vehicle routing problems (VRPs) can be divided into two major categories: offline VRPs, which consider a given set of trip requests to be served, and online VRPs, which consider requests as they arrive in real time. Based on discussions with public transit agencies, we identify a real-world problem that is not addressed by existing formulations: booking trips with flexible pickup windows (e.g., 3 hours) in advance (e.g., the day before) and confirming tight pickup windows (e.g., 30 minutes) at the time of booking. Such a service model is often required in paratransit service settings, where passengers typically book trips for the next day over the phone. To address this gap between offline and online problems, we introduce a novel formulation, the offline vehicle routing problem with online bookings. This problem is computationally very challenging because it must handle large sets of requests, as offline VRPs do, while abiding by strict constraints on running time, as online VRPs do. To solve this problem, we propose a novel computational approach, which combines an anytime algorithm with a learning-based policy for real-time decisions. Based on a paratransit dataset obtained from our partner transit agency, we demonstrate that our novel formulation and computational approach lead to significantly better outcomes in this service setting than existing algorithms.
-
Multi-armed bandit algorithms have become a reference solution for handling the explore/exploit dilemma in recommender systems and in many other important real-world problems, such as display advertising. However, such algorithms usually assume a stationary reward distribution, which hardly holds in practice because users' preferences are dynamic. This inevitably leads to consistently suboptimal performance in a recommender system. In this paper, we consider the situation where the underlying distribution of reward remains unchanged over (possibly short) epochs and shifts at unknown time instants. Accordingly, we propose a contextual bandit algorithm that detects possible changes of environment based on its reward estimation confidence and updates its arm selection strategy accordingly. Rigorous upper regret bound analysis of the proposed algorithm demonstrates its learning effectiveness in such a non-trivial environment. Extensive empirical evaluations on both synthetic and real-world datasets for recommendation confirm its practical utility in a changing environment.
-
Latent factor models have become a prevalent method in recommender systems for predicting users' preferences for items based on historical user feedback. Most of the existing methods, explicitly or implicitly, are built upon the first-order rating distance principle, which aims to minimize the difference between the estimated and real ratings. In this paper, we generalize this first-order rating distance principle and propose a new latent factor model (HoORaYs) for recommender systems. The core idea of the proposed method is to explore high-order rating distance, which aims to minimize not only (i) the difference between the estimated and real ratings of the same (user, item) pair (i.e., the first-order rating distance), but also (ii) the difference between the estimated and real rating difference of the same user across different items (i.e., the second-order rating distance). We formulate it as a regularized optimization problem, and propose an effective and scalable algorithm to solve it. Our analysis from the geometric and Bayesian perspectives indicates that exploring the high-order rating distance helps to reduce the variance of the estimator, which in turn leads to better generalization performance (e.g., smaller prediction error). We evaluate the proposed method on four real-world data sets, two with explicit user feedback and the other two with implicit user feedback. Experimental results show that the proposed method consistently outperforms the state-of-the-art methods in terms of prediction accuracy.
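The two distance terms described above can be sketched as a single objective. This is an illustrative reading of the abstract, not the HoORaYs formulation: the function name `high_order_loss`, the squared-error form, and the trade-off parameter `weight` are assumptions, and the regularization term the abstract mentions is omitted.

```python
def high_order_loss(ratings, predict, weight=1.0):
    """Sum of first-order and second-order squared rating distances.

    ratings: dict mapping (user, item) -> observed rating.
    predict: callable (user, item) -> estimated rating.
    weight:  hypothetical trade-off applied to the second-order term.
    """
    # First-order distance: estimated vs. observed rating per (user, item).
    first = sum((r - predict(u, i)) ** 2 for (u, i), r in ratings.items())

    # Group each user's observed ratings so item pairs can be compared.
    by_user = {}
    for (u, i), r in ratings.items():
        by_user.setdefault(u, []).append((i, r))

    # Second-order distance: estimated vs. observed rating *difference*
    # of the same user across two different items.
    second = 0.0
    for u, items in by_user.items():
        for a in range(len(items)):
            for b in range(a + 1, len(items)):
                (i, ri), (j, rj) = items[a], items[b]
                observed_gap = ri - rj
                estimated_gap = predict(u, i) - predict(u, j)
                second += (observed_gap - estimated_gap) ** 2

    return first + weight * second
```

A predictor that gets every rating roughly right on average but flattens a user's preferences between items is penalized by the second term even when the first term is small, which is the intuition the abstract gives for the second-order distance.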
-
Recommender systems traditionally find the most relevant products or services for users, tailored to their needs or interests, but they ignore the interests of the other sides of the market (aka stakeholders). In this paper, we propose to use a Ranked Bandit approach for an online multi-stakeholder recommender system that sequentially selects the top 𝑘 items according to the relevance and priority of all the involved stakeholders. We present three different criteria to consider the priority of each stakeholder when evaluating our approach. Our extensive experimental results on a movie dataset show that contextual multi-armed bandits with a relevance function yield a higher level of satisfaction for all involved stakeholders in the long term. Keywords: Multi-stakeholder Recommender Systems; Multi-armed Bandits; Ranked Bandit