We present the first GPU-based parallel algorithm to efficiently update vertex coloring on large dynamic networks. For single GPU, we introduce the concept of loosely maintained vertex color update that reduces computation and memory requirements. For multiple GPUs, in distributed environments, we propose priority-based ordering of vertices to reduce the communication time. We prove the correctness of our algorithms and experimentally demonstrate that for graphs of over 16 million vertices and over 134 million edges on a single GPU, our dynamic algorithm is as much as 20x faster than state-of-the-art algorithm on static graphs. For larger graphs with over 130 million vertices and over 260 million edges, our distributed implementation with 8 GPUs produces updated color assignments within 160 milliseconds. In all cases, the proposed parallel algorithms produce comparable or fewer colors than state-of-the-art algorithms.
more »
« less
Rapid Analysis of Network Connectivity
This research focuses on accelerating the computational time of two base network algorithms (k-simple shortest paths and minimum spanning tree for a subset of nodes)---cornerstones behind a variety of network connectivity mining tasks---with the goal of rapidly finding networkpathways andtrees using a set of user-specific query nodes. To facilitate this process we utilize: (1) multi-threaded algorithm variations, (2) network re-use for subsequent queries and (3) a novel algorithm, Key Neighboring Vertices (KNV), to reduce the network search space. The proposed KNV algorithm serves a dual purpose: (a) to reduce the computation time for algorithmic analysis and (b) to identify key vertices in the network (\textit ). Empirical results indicate this combination of techniques significantly improves the baseline performance of both algorithms. We have also developed a web platform utilizing the proposed network algorithms to enable researchers and practitioners to both visualize and interact with their datasets (PathFinder: http://www.path-finder.io.)
more »
« less
- PAR ID:
- 10062416
- Date Published:
- Journal Name:
- ACM CIKM
- Page Range / eLocation ID:
- 2463 to 2466
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
null (Ed.)Abstract A novel, exact algorithm is presented to solve the path planning problem that involves finding the shortest collision-free path from a start to a goal point in a two-dimensional environment containing convex and non-convex obstacles. The proposed algorithm, which is called the shortest possible path (SPP) algorithm, constructs a network of lines connecting the vertices of the obstacles and the locations of the start and goal points which is smaller than the network generated by the visibility graph. Then it finds the shortest path from start to goal point within this network. The SPP algorithm generates a safe, smooth and obstacle-free path that has a desired distance from each obstacle. This algorithm is designed for environments that are populated sparsely with convex and nonconvex polygonal obstacles. It has the capability of eliminating some of the polygons that do not play any role in constructing the optimal path. It is proven that the SPP algorithm can find the optimal path in O(nn r2 ) time, where n is the number of vertices of all polygons and n ̓ is the number of vertices that are considered in constructing the path network (n ̓ ≤ n). The performance of the algorithm is evaluated relative to three major classes of algorithms: heuristic, probabilistic, and classic. Different benchmark scenarios are used to evaluate the performance of the algorithm relative to the first two classes of algorithms: GAMOPP (genetic algorithm for multi-objective path planning), a representative heuristic algorithm, as well as RRT (rapidly-exploring random tree) and PRM (probabilistic road map), two well-known probabilistic algorithms. Time complexity is known for classic algorithms, so the presented algorithm is compared analytically.more » « less
-
We consider the maximum vertex-weighted matching problem (MVM), in which non-negative weights are assigned to the vertices of a graph, and the weight of a matching is the sum of the weights of the matched vertices. Although exact algorithms for MVM are faster than exact algorithms for the maximum edge-weighted matching problem, there are graphs on which these exact algorithms could take hundreds of hours. For a natural number k, we design a k/(k + 1)approximation algorithm for MVM on non-bipartite graphs that updates the matching along certain short paths in the graph: either augmenting paths of length at most 2k + 1 or weight-increasing paths of length at most 2k. The choice of k = 2 leads to a 2/3-approximation algorithm that computes nearly optimal weights fast. This algorithm could be initialized with a 2/3-approximate maximum cardinality matching to reduce its runtime in practice. A 1/2-approximation algorithm may be obtained using k = 1, which is faster than the 2/3-approximation algorithm but it computes lower weights. The 2/3-approximation algorithm has time complexity O(Δ2m) while the time complexity of the 1/2-approximation algorithm is O(Δm), where m is the number of edges and Δ is the maximum degree of a vertex. Results from our serial implementations show that on average the 1/2-approximation algorithm runs faster than the Suitor algorithm (currently the fastest 1/2-approximation algorithm) while the 2/3-approximation algorithm runs as fast as the Suitor algorithm but obtains higher weights for the matching. One advantage of the proposed algorithms is that they are well-suited for parallel implementation since they can process vertices to match in any order. The 1/2- and 2/3-approximation algorithms have been implemented on a shared memory parallel computer, and both approximation algorithms obtain good speedups, while the former algorithm runs faster on average than the parallel Suitor algorithm. Care is needed to design the parallel algorithm to avoid cyclic waits, and we prove that it is live-lock free.more » « less
-
MapReduce jobs need to shuffle a large amount of data over the network between mapper and reducer nodes. The shuffle time accounts for a big part of the total running time of the MapReduce jobs. Therefore, optimizing the makespan of shuffle phase can greatly improve the performance of MapReduce jobs. A large fraction of production jobs in data centers are recurring with predictable characteristics, and the recurring jobs split the network into periodic busy and idle time slots, which allows us to better schedule the shuffle data in order to reduce the makespan of shuffle phase with the future predictable network status available. In this paper, we formulate the shuffle scheduling problem with the aim to minimize the makespan of MapReduce shuffle phase by leveraging the predictable periodic network status. We then propose a simple yet effective network-aware shuffle scheduling algorithm (NAS) to reduce the number of idle time slots required to transfer the shuffle data so as to reduce the shuffle makespan. We also prove that the proposed algorithm NAS is a 32 -approximation algorithm to the shuffle scheduling problem when all the future idle time slots have the same duration. We finally conduct experiments through simulations. Experimental results demonstrate the proposed algorithm can effectively reduce the makespan of MapReduce shuffle phase and increase network utilization.more » « less
-
There is a rich theory and plethora of algorithms in the literature aiming at the efficient scheduling of stochastic networks. These solutions are predominantly designed under the assumption of traffic demands that are independently generated at network nodes, without any requirement for synchronization among their received services. In this work, we note that many applications, including cloud computing, virtual reality, gaming, autonomous vehicular networks and collaborative design, generate traffic simultaneously at multiple nodes when they arrive, with possibly non-uniform file sizes, whose performance relies on the synchronous completion of the traffic across the network. This calls for the design of new scheduling algorithms that aims to coordinate the service of packets of the same traffic across the network. Towards this end, we propose a novel scheduling algorithm that not only accounts for the heterogeneity of the file size distributions, but also works towards synchronizing the completion time of the same traffic stream across the network. This is achieved by employing two insights that emanate from key motivating examples we develop: (1) the normalization of traffic load with respect to the non-uniform file sizes; and (2) the incorporation of deviation of normalized loads across network nodes that serve synchronized traffic. After establishing the throughput-optimality of our algorithm in general stochastic networks, we perform extensive simulations under various (spanning both wired and wireless) settings to reveal the potential completion time gains that it yields over other throughput-optimal strategies designed under the assumption of independent traffic generation.more » « less
An official website of the United States government

