The Pareto-optimal frontier for a bi-objective search problem instance consists of all solutions that are not worse than any other solution in both objectives. The size of the Pareto-optimal frontier can be exponential in the size of the input graph, and hence finding it can be hard. Some existing works leverage a user-specified approximation factor ε to compute an approximate Pareto-optimal frontier that can be significantly smaller than the Pareto-optimal frontier. In this paper, we propose an anytime approximate bi-objective search algorithm, called Anytime Bi-Objective A*ε (A-BOA*ε). A-BOA*ε is useful when deliberation time is limited. It first finds an approximate Pareto-optimal frontier quickly, iteratively improves it while time allows, and eventually finds the Pareto-optimal frontier. It efficiently reuses the search effort from previous iterations and makes use of a novel pruning technique. Our experimental results show that A-BOA*ε substantially outperforms baseline algorithms that do not reuse previous search effort, both in terms of runtime and number of node expansions. In fact, the most advanced variant of A-BOA*ε even slightly outperforms BOA*, a state-of-the-art bi-objective search algorithm, for finding the Pareto-optimal frontier. Moreover, given only a limited amount of deliberation time, A-BOA*ε finds solutions that collectively approximate the Pareto-optimal frontier much better than the solutions found by BOA*.
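The ε-approximation criterion the abstract relies on can be stated compactly: a found solution "covers" a Pareto-optimal one if it is within a factor of (1 + ε) of it in every objective. A minimal sketch, with an illustrative function name and tuples as cost vectors (neither is from the paper):

```python
def eps_covers(c, d, eps):
    """True iff cost vector c is within a factor of (1 + eps) of cost
    vector d in every objective, i.e. c approximately dominates d.
    With eps = 0 this reduces to ordinary weak dominance."""
    return all(ci <= (1 + eps) * di for ci, di in zip(c, d))
```

An ε-approximate Pareto-optimal frontier then only needs, for every Pareto-optimal cost vector `d`, some member `c` with `eps_covers(c, d, eps)`, which is why it can be much smaller than the exact frontier.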
A*pex: Efficient Approximate Multi-Objective Search on Graphs
In multi-objective search, edges are annotated with cost vectors consisting of multiple cost components. A path dominates another path with the same start and goal vertices iff the component-wise sum of the cost vectors of the edges of the former path is 'less than' the component-wise sum of the cost vectors of the edges of the latter path. The Pareto-optimal solution set is the set of all undominated paths from a given start vertex to a given goal vertex. Its size can be exponential in the size of the graph being searched, which makes multi-objective search time-consuming. In this paper, we therefore study how to find an approximate Pareto-optimal solution set for a user-provided vector of approximation factors. The size of such a solution set can be significantly smaller than the size of the Pareto-optimal solution set, which enables the design of approximate multi-objective search algorithms that are efficient and produce small solution sets. We present such an algorithm in this paper, called A*pex. A*pex builds on PP-A*, a state-of-the-art approximate bi-objective search algorithm (where there are only two cost components) but (1) makes PP-A* more efficient for bi-objective search and (2) generalizes it to multi-objective search for any number of cost components. We first analyze the correctness of A*pex and then experimentally demonstrate its efficiency advantage over existing approximate algorithms for bi- and tri-objective search.
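The dominance relation described above reduces to a component-wise comparison of the paths' summed cost vectors. A minimal sketch (function names and the tuple representation are illustrative):

```python
def path_cost(edge_costs):
    """Component-wise sum of the cost vectors of a path's edges."""
    return tuple(map(sum, zip(*edge_costs)))

def dominates(c, d):
    """c dominates d iff c is no worse in every cost component and
    strictly better in at least one (the 'less than' relation above)."""
    return all(a <= b for a, b in zip(c, d)) and any(a < b for a, b in zip(c, d))
```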
 Award ID(s): 2121028
 NSF-PAR ID: 10350229
 Journal Name: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS)
 Format(s): Medium: X
 Sponsoring Org: National Science Foundation
More Like this


Many interesting search problems can be formulated as bi-objective search problems, that is, search problems where two kinds of costs have to be minimized, for example, travel distance and time for transportation problems. Bi-objective search algorithms have to maintain the set of undominated paths from the start state to each state to compute the set of paths from the start state to the goal state that are not dominated by some other path from the start state to the goal state (called the Pareto-optimal solution set). Each time they find a new path to a state s, they perform a dominance check to determine whether this path dominates any of the previously found paths to s or whether any of the previously found paths to s dominates this path. Existing algorithms do not perform these checks efficiently. On the other hand, our Bi-Objective A* (BOA*) algorithm requires only constant time per check. In our experimental evaluation, we show that BOA* can run an order of magnitude (or more) faster than state-of-the-art bi-objective search algorithms, such as NAMOA*, NAMOA*dr, Bi-Objective Dijkstra, and Bidirectional Bi-Objective Dijkstra.
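The constant-time check comes from expanding nodes in lexicographic order of (f1, f2): under that order, a new node for state s is dominated iff its g2-value is no smaller than the best g2-value already expanded at s, a single scalar comparison. A condensed sketch under assumed interfaces (`succ(s)` yields `(successor, c1, c2)` triples and `h(s)` returns an admissible heuristic pair; both are placeholders, not the paper's API):

```python
import heapq

def boa_star(succ, start, goal, h):
    """Sketch of BOA*: returns the Pareto-optimal cost vectors from
    start to goal, using one scalar per state for dominance checks."""
    g2_min = {}                        # minimum g2 expanded per state
    solutions = []
    h1, h2 = h(start)
    open_ = [(h1, h2, 0, 0, start)]    # (f1, f2, g1, g2, state), lexicographic
    while open_:
        f1, f2, g1, g2, s = heapq.heappop(open_)
        # Constant-time dominance checks: compare against one scalar each.
        if g2 >= g2_min.get(s, float('inf')) or f2 >= g2_min.get(goal, float('inf')):
            continue
        g2_min[s] = g2
        if s == goal:
            solutions.append((g1, g2))
            continue
        for t, c1, c2 in succ(s):
            th1, th2 = h(t)
            ng1, ng2 = g1 + c1, g2 + c2
            heapq.heappush(open_, (ng1 + th1, ng2 + th2, ng1, ng2, t))
    return solutions
```

Because each check touches only `g2_min[s]` and `g2_min[goal]`, it costs O(1) per node, which is where the order-of-magnitude speed-ups come from.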

This work addresses a Multi-Objective Shortest Path Problem (MOSPP) on a graph where the goal is to find a set of Pareto-optimal solutions from a start node to a destination in the graph. A family of approaches based on MOA* have been developed to solve MOSPP in the literature. Typically, these approaches maintain a "frontier" set at each node during the search process to keep track of the non-dominated, partial paths to reach that node. This search process becomes computationally expensive when the number of objectives increases as the number of Pareto-optimal solutions becomes large. In this work, we introduce a new method to efficiently maintain these frontiers for multiple objectives by incrementally constructing balanced binary search trees within the MOA* search framework. We first show that our approach correctly finds the Pareto-optimal front, and then provide extensive simulation results for problems with three, four and five objectives to show that our method runs faster than existing techniques by up to an order of magnitude.
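Whatever data structure backs it, each per-node frontier maintains one invariant: its members are mutually non-dominated. A naive linear-scan sketch of that invariant (the paper's contribution is to replace exactly this scan with incrementally constructed balanced binary search trees; the function name is illustrative):

```python
def insert_into_frontier(frontier, c):
    """Insert cost vector c into a list of mutually non-dominated
    vectors; returns (new_frontier, accepted).  This linear scan is
    the O(|frontier|) step the BST-based method avoids."""
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and a != b
    if any(dominates(f, c) for f in frontier):
        return frontier, False          # c is dominated: reject it
    kept = [f for f in frontier if not dominates(c, f)]
    kept.append(c)                      # c evicts anything it dominates
    return kept, True
```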

We present a policy gradient method for Multi-Objective Reinforcement Learning under unknown, linear preferences. By enforcing Pareto stationarity, a first-order condition for Pareto optimality, we are able to design a simple policy gradient algorithm that approximates the Pareto front and infers the unknown preferences. Our method relies on a projected gradient descent solver that identifies common ascent directions for all objectives. Leveraging the solution of that solver, we introduce Pareto Policy Adaptation (PPA), a loss function that adapts the policy to be optimal with respect to any distribution over preferences. PPA uses implicit differentiation to backpropagate the loss gradient bypassing the operations of the projected gradient descent solver. Our approach is straightforward, easy to implement and can be used with all existing policy gradient and actor-critic methods. We evaluate our method in a series of reinforcement learning tasks.
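For two objectives, a common ascent direction of the kind such a solver identifies has a well-known closed form: the minimum-norm element of the convex hull of the two policy gradients. The sketch below shows that standard two-objective special case, not the paper's general projected-gradient-descent solver:

```python
import numpy as np

def common_ascent_direction(g1, g2):
    """Minimum-norm convex combination of two gradients (closed form
    for two objectives).  The result has non-negative inner product
    with both g1 and g2, so a small step along it does not decrease
    either objective; a zero vector signals Pareto stationarity."""
    diff = g2 - g1
    denom = diff @ diff
    if denom == 0.0:
        return g1                      # gradients coincide
    alpha = np.clip(diff @ g2 / denom, 0.0, 1.0)
    return alpha * g1 + (1.0 - alpha) * g2
```

A (near-)zero return value certifies Pareto stationarity, the first-order condition mentioned above.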