This content will become publicly available on April 1, 2024
- Award ID(s):
- NSF-PAR ID:
- Date Published:
- Journal Name:
- Proceedings of the 27th International Conference on Research in Computational Molecular Biology (RECOMB 2023)
- Page Range / eLocation ID:
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
Abstract Background Cell signaling pathways, which are a series of reactions that start at receptors and end at transcription factors, are basic to systems biology. Properly modeling the reactions in such pathways requires directed hypergraphs , where an edge is now directed between two sets of vertices. Inferring a pathway by the most parsimonious series of reactions corresponds to finding a shortest hyperpath in a directed hypergraph, which is NP-complete. The current state-of-the-art for shortest hyperpaths in cell signaling hypergraphs solves a mixed-integer linear program to find an optimal hyperpath that is restricted to be acyclic, and offers no efficiency guarantees. Results We present, for the first time, a heuristic for general shortest hyperpaths that properly handles cycles , and is guaranteed to be efficient . We show the heuristic finds provably optimal hyperpaths for the class of singleton-tail hypergraphs, and also give a practical algorithm for tractably generating all source-sink hyperpaths. The accuracy of the heuristic is demonstrated through comprehensive experiments on all source-sink instances from the standard NCI-PID and Reactome pathway databases, which show it finds a hyperpath that matches the state-of-the-art mixed-integer linear program on over 99% of all instances that are acyclic. On instances where only cyclic hyperpaths exist, the heuristic surpasses the state-of-the-art, which finds no solution; on every such cyclic instance, enumerating all source-sink hyperpaths shows the solution found by the heuristic was in fact optimal . Conclusions The new shortest hyperpath heuristic is both fast and accurate . This makes finding source-sink hyperpaths, which in general may contain cycles, now practical for real cell signaling networks. Availability Source code for the hyperpath heuristic in a new tool we call (as well as for hyperpath enumeration, and all dataset instances) is available free for non-commercial use at .more » « less
Carbone, Alessandra ; El-Kebir, Mohammed (Ed.)Cell signaling pathways, which are a series of reactions that start at receptors and end at transcription factors, are basic to systems biology. Properly modeling the reactions in such pathways requires directed hypergraphs, where an edge is now directed between two sets of vertices. Inferring a pathway by the most parsimonious series of reactions then corresponds to finding a shortest hyperpath in a directed hypergraph, which is NP-complete. The state of the art for shortest hyperpaths in cell-signaling hypergraphs solves a mixed-integer linear program to find an optimal hyperpath that is restricted to be acyclic, and offers no efficiency guarantees. We present for the first time a heuristic for general shortest hyperpaths that properly handles cycles, and is guaranteed to be efficient. Its accuracy is demonstrated through exhaustive experiments on all instances from the standard NCI-PID and Reactome pathway databases, which show the heuristic finds a hyperpath that matches the state-of-the-art mixed-integer linear program on over 99% of all instances that are acyclic. On instances where only cyclic hyperpaths exist, the heuristic surpasses the state-of-the-art, which finds no solution; on every such cyclic instance, enumerating all possible hyperpaths shows that the solution found by the heuristic is in fact optimal.more » « less
We consider a variant of the vehicle routing problem (VRP) where each customer has a unit demand and the goal is to minimize the total cost of routing a fleet of capacitated vehicles from one or multiple depots to visit all customers. We propose two parallel algorithms to efficiently solve the column-generation-based linear-programming relaxation for this VRP. Specifically, we focus on algorithms for the “pricing problem,” which corresponds to the resource-constrained elementary shortest path problem. The first algorithm extends the pulse algorithm for which we derive a new bounding scheme on the maximum load of any route. The second algorithm is based on random coloring from parameterized complexity which can be also combined with other techniques in the literature for improving VRPs, including cutting planes and column enumeration. We conduct numerical studies using VRP benchmarks (with 50–957 nodes) and instances of a medical home care delivery problem using census data in Wayne County, Michigan. Using parallel computing, both pulse and random coloring can significantly improve column generation for solving the linear programming relaxations and we can obtain heuristic integer solutions with small optimality gaps. Combining random coloring with column enumeration, we can obtain improved integer solutions having less than 2% optimality gaps for most VRP benchmark instances and less than 1% optimality gaps for the medical home care delivery instances, both under a 30-minute computational time limit. The use of cutting planes (e.g., robust cuts) can further reduce optimality gaps on some hard instances, without much increase in the run time. Summary of Contribution: The vehicle routing problem (VRP) is a fundamental combinatorial problem, and its variants have been studied extensively in the literature of operations research and computer science. In this paper, we consider general-purpose algorithms for solving VRPs, including the column-generation approach for the linear programming relaxations of the integer programs of VRPs and the column-enumeration approach for seeking improved integer solutions. We revise the pulse algorithm and also propose a random-coloring algorithm that can be used for solving the elementary shortest path problem that formulates the pricing problem in the column-generation approach. We show that the parallel implementation of both algorithms can significantly improve the performance of column generation and the random coloring algorithm can improve the solution time and quality of the VRP integer solutions produced by the column-enumeration approach. We focus on algorithmic design for VRPs and conduct extensive computational tests to demonstrate the performance of various approaches.more » « less
A factory in a metabolic network specifies how to produce target molecules from source compounds through biochemical reactions, properly accounting for reaction stoichiometry to conserve or not deplete intermediate metabolites. While finding factories is a fundamental problem in systems biology, available methods do not consider the number of reactions used, nor address negative regulation.
We introduce the new problem of finding optimal factories that use the fewest reactions, for the first time incorporating both first- and second-order negative regulation. We model this problem with directed hypergraphs, prove it is NP-complete, solve it via mixed-integer linear programming, and accommodate second-order negative regulation by an iterative approach that generates next-best factories.
This optimization-based approach is remarkably fast in practice, typically finding optimal factories in a few seconds, even for metabolic networks involving tens of thousands of reactions and metabolites, as demonstrated through comprehensive experiments across all instances from standard reaction databases.
Availability and implementation
Source code for an implementation of our new method for optimal factories with negative regulation in a new tool called Odinn, together with all datasets, is available free for non-commercial use at http://odinn.cs.arizona.edu.
Speed is essential in wildlife surveys due to the dynamic movement of animals throughout their environment and potentially extreme changes in weather. In this work, we present a multirobot path-planning method for conducting aerial surveys over large areas designed to make the best use of limited flight time. Unlike current survey path-planning solutions based on geometric patterns or integer programs, we solve a series of satisfiability modulo theory instances of increasing complexity. Each instance yields a set of feasible paths at each iteration and recovers the set of shortest paths after sufficient time. We implemented our planning algorithm with a team of drones to conduct multiple photographic aerial wildlife surveys of Cape Crozier, one of the largest Adélie penguin colonies in the world containing more than 300,000 nesting pairs. Over 2 square kilometers was surveyed in about 3 hours. In contrast, previous human-piloted single-drone surveys of the same colony required over 2 days to complete. Our method reduces survey time by limiting redundant travel while also allowing for safe recall of the drones at any time during the survey. Our approach can be applied to other domains, such as wildfire surveys in high-risk weather conditions or disaster response.