Multi-Agent Path Finding (MAPF), i.e., finding collision-free paths for multiple robots, is important for many applications where small runtimes are necessary, including the kind of automated warehouses operated by Amazon. CBS is a leading two-level search algorithm for solving MAPF optimally. ECBS is a bounded-suboptimal variant of CBS that uses focal search to speed up CBS by sacrificing optimality and instead guaranteeing that the costs of its solutions are within a given factor of optimal. In this paper, we study how to decrease its runtime even further using inadmissible heuristics. Motivated by Explicit Estimation Search (EES), we propose Explicit Estimation CBS (EECBS), a new bounded-suboptimal variant of CBS, that uses online learning to obtain inadmissible estimates of the cost of the solution of each high-level node and uses EES to choose which high-level node to expand next. We also investigate recent improvements of CBS and adapt them to EECBS. We find that EECBS with the improvements runs significantly faster than the state-of-the-art bounded-suboptimal MAPF algorithms ECBS, BCP-7, and eMDD-SAT on a variety of MAPF instances. We hope that the scalability of EECBS enables additional applications for bounded-suboptimal MAPF algorithms.
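The focal-search idea mentioned in the abstract above can be illustrated with a minimal sketch. It is not the ECBS or EECBS implementation: the `Node` fields, the function name `select_from_focal`, and the use of a conflict count as the secondary priority `d` are assumptions made for illustration. The sketch only shows the selection rule, which restricts the choice to nodes whose lower bound `f` is within a factor `w` of the best lower bound and then picks among them by the secondary priority.

```python
from typing import NamedTuple, Sequence


class Node(NamedTuple):
    """Hypothetical search node used only for this illustration."""
    name: str
    f: float  # lower bound on the cost of any solution below this node
    d: int    # secondary priority, assumed here to be a conflict count


def select_from_focal(open_list: Sequence[Node], w: float) -> Node:
    """Pick the next node to expand with a focal-search selection rule.

    FOCAL holds every node whose f-value is at most w times the smallest
    f-value in the open list. Among those, the node with the smallest
    secondary priority d is chosen (ties broken by f).
    """
    if w < 1.0:
        raise ValueError("suboptimality factor w must be at least 1")
    if not open_list:
        raise ValueError("open list is empty")
    f_min = min(node.f for node in open_list)
    focal = [node for node in open_list if node.f <= w * f_min]
    return min(focal, key=lambda node: (node.d, node.f))


if __name__ == "__main__":
    nodes = [Node("a", 10.0, 5), Node("b", 10.5, 1), Node("c", 12.0, 0)]
    # With w = 1.1 the bound is 11, so only "a" and "b" are eligible.
    print(select_from_focal(nodes, 1.1).name)
```

With `w = 1.0` the rule degenerates to best-first selection on `f`; larger values of `w` give the secondary priority more room to steer the search.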
Safe Multi-Agent Pathfinding with Time Uncertainty
In many real-world scenarios, the time it takes for a mobile agent, e.g., a robot, to move from one location to another may vary due to exogenous events and be difficult to predict accurately. Planning in such scenarios is challenging, especially in the context of Multi-Agent Pathfinding (MAPF), where the goal is to find paths for multiple agents and temporal coordination is necessary to avoid collisions. In this work, we consider a MAPF problem with this form of time uncertainty, where we are only given upper and lower bounds on the time it takes each agent to move. The objective is to find a safe solution, which is a solution that can be executed by all agents and is guaranteed to avoid collisions. We propose two complete and optimal algorithms for finding safe solutions based on well-known MAPF algorithms, namely, A* with Operator Decomposition (A* + OD) and Conflict-Based Search (CBS). Experimentally, we observe that on several standard MAPF grids the CBS-based algorithm performs better. We also explore the option of online replanning in this context, i.e., modifying the agents' plans during execution, to reduce the overall execution cost. We consider two online settings: (a) when an agent can sense …
- Award ID(s): 1908287
- Publication Date:
- NSF-PAR ID: 10297272
- Journal Name: Journal of Artificial Intelligence Research
- Volume: 70
- ISSN: 1076-9757
- Sponsoring Org: National Science Foundation
More Like this
-
Multi-Agent Path Finding (MAPF) is the combinatorial problem of finding collision-free paths for multiple agents on a graph. This paper describes MAPF-based software for solving train planning and replanning problems on large-scale rail networks under uncertainty. The software recently won the 2020 Flatland Challenge, a NeurIPS competition trying to determine how to efficiently manage dense traffic on rail networks. The software incorporates many state-of-the-art MAPF or, in general, optimization technologies, such as prioritized planning, large neighborhood search, safe interval path planning, minimum communication policies, parallel computing, and simulated annealing. It can plan collision-free paths for thousands of trains within a few minutes and deliver deadlock-free actions in real-time during execution.
-
In a multi-agent path finding (MAPF) problem, the task is to move a set of agents to their goal locations without conflicts. In the real world, unexpected events may delay some of the agents. In this paper, we therefore study the problem of finding a p-robust solution to a given MAPF problem, which is a solution that succeeds with probability at least p, even though unexpected delays may occur. We propose two methods for verifying that given solutions are p-robust. We also introduce an optimal CBS-based algorithm, called pR-CBS, and a fast suboptimal algorithm, called pR-GCBS, for finding such solutions. Our experiments show that a p-robust solution reduces the number of conflicts compared to optimal, non-robust solutions.
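As a rough illustration of what checking p-robustness involves, the sketch below estimates the success probability of a given solution by repeated simulation. This is an assumption-laden sketch, not one of the two verification methods the abstract refers to: the delay model (each agent is independently delayed with probability `delay_prob` at every step), the function names, and the conflict definition (vertex and swap conflicts only, with distinct start locations) are all hypothetical choices.

```python
import random
from typing import Hashable, List, Sequence

Path = Sequence[Hashable]


def execute_once(paths: List[Path], delay_prob: float, rng: random.Random) -> bool:
    """Simulate one execution; return True if no conflict occurs.

    Assumed delay model: at every time step, each agent that has not yet
    reached its goal stays in place with probability delay_prob.
    Start locations are assumed to be distinct.
    """
    idx = [0] * len(paths)
    while any(i < len(p) - 1 for i, p in zip(idx, paths)):
        new_idx = [
            i + 1 if i < len(p) - 1 and rng.random() >= delay_prob else i
            for i, p in zip(idx, paths)
        ]
        old = [p[i] for p, i in zip(paths, idx)]
        new = [p[i] for p, i in zip(paths, new_idx)]
        # Vertex conflict: two agents occupy the same location.
        if len(set(new)) < len(new):
            return False
        # Swap conflict: two agents exchange locations in one step.
        for a in range(len(paths)):
            for b in range(a + 1, len(paths)):
                if old[a] == new[b] and old[b] == new[a]:
                    return False
        idx = new_idx
    return True


def estimate_success_probability(
    paths: List[Path], delay_prob: float, trials: int = 1000, seed: int = 0
) -> float:
    """Monte Carlo estimate of the probability of a conflict-free execution."""
    if not 0.0 <= delay_prob < 1.0:
        raise ValueError("delay_prob must be in [0, 1)")
    if trials <= 0:
        raise ValueError("trials must be positive")
    rng = random.Random(seed)
    successes = sum(execute_once(paths, delay_prob, rng) for _ in range(trials))
    return successes / trials


if __name__ == "__main__":
    # Agent 2 waits at D and enters B only after agent 1 is planned to leave it.
    solution = [["A", "B", "C"], ["D", "D", "B"]]
    estimate = estimate_success_probability(solution, delay_prob=0.1)
    print(f"estimated success probability: {estimate:.3f}")
```

Comparing the estimate against a threshold p gives statistical evidence only; it does not prove that a solution is p-robust.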
-
We propose and evaluate a learning-based framework to address multi-agent resource allocation in coupled wireless systems. In particular, we consider multiple agents (e.g., base stations, access points, etc.) that choose among a set of resource allocation options to achieve their own performance objectives/requirements, and where the performance observed at each agent is further coupled with the actions chosen by the other agents, e.g., through interference, channel leakage, etc. The challenge is to find the best collective action. To that end, we propose a Multi-Armed Bandit (MAB) framework wherein the best actions (aka arms) are adaptively learned through online reward feedback. Our focus is on systems that are "weakly coupled," wherein the best arm of each agent is invariant to the others' arm selection the majority of the time; this majority structure enables one to develop lightweight, efficient algorithms. This structure is commonly found in many wireless settings such as channel selection and power control. We develop a bandit algorithm based on the Track-and-Stop strategy, which shows a logarithmic regret with respect to a genie. Finally, through simulation, we exhibit the potential use of our model and algorithm in several wireless application scenarios.