skip to main content


Title: High-Performance Truss Analysis in Arkouda
In graph analytics, a truss is a cohesive subgraph based on the number of triangles supporting each edge. It is widely used for community detection applications such as social networks and security analysis, and the performance of truss analytics highly depends on its triangle counting method. This paper proposes a novel triangle counting kernel named Minimum Search (MS). Minimum Search can select two smaller adjacency lists out of three and uses fine-grained parallelism to improve the performance of triangle counting. Then, two basic algorithms, MS-based triangle counting, and MS-based support updating are developed. Based on the novel triangle counting kernel and the two basic algorithms above, three fundamental parallel truss analytics algorithms are designed and implemented to enable different kinds of graph truss analysis. These truss algorithms include an optimized K-Truss algorithm, a Max-Truss algorithm, and a Truss Decomposition algorithm. Moreover, all proposed algorithms have been implemented in the parallel language Chapel and integrated into an open-source framework, Arkouda. Through Arkouda, data scientists can efficiently conduct graph analysis through an easy-to-use Python interface and handle large-scale graph data in powerful back-end computing resources. Experimental results show that the proposed methods can significantly improve the performance of truss analysis on real-world graphs compared with the existing and widely adopted list intersection-based method. The implemented code is publicly available from GitHub (https://github.com/Bears-R-Us/arkoudanjit). }  more » « less
Award ID(s):
2109988
PAR ID:
10385344
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
The 29th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The K-Truss of a graph is a cohesive subgraph that has been widely used for community detection in applications such as social networks and security analysis. In this paper, we first propose one optimized triangle search kernel with a few operations that can be used in both triangle counting and triangle search to replace the existing list intersection method. Based on the optimized kernel, three truss analytics algorithms, an optimized K-Truss parallel algorithm, a maximal K-Truss parallel algorithm, and a Truss decomposition parallel algorithm, are developed to efficiently enable different kinds of graph analysis. Moreover, all proposed parallel algorithms have been implemented in the highly-productive parallel language Chapel and integrated into the open-source framework Arkouda. Experimental results compared with the existing list intersection-based method show that for both synthetic and real-world graphs, the proposed method can significantly improve the performance of truss analysis on large graphs. The implemented method is publicly available from GitHub. 
    more » « less
  2. Due to the emergence of massive real-world graphs, whose sizes may extend to terabytes, new tools must be developed to enable data scientists to handle such graphs efficiently. These graphs may include social networks, computer networks, and genomes. In this paper, we propose a novel graph package, Arachne, to make large-scale graph analytics more effortless and efficient based on the open-source Arkouda framework. Arkouda has been developed to allow users to perform massively parallel computations on distributed data with an interface similar to NumPy. In this package, we developed a fundamental sparse graph data structure and then built several useful graph algorithms around our data structure to form a basic algorithmic library. Benchmarks and tools were also developed to evaluate and demonstrate the use of our graph algorithms. The graph algorithms we have implemented thus far include breadth-first search (BFS), connected components (CC), k-Truss (KT), Jaccard coefficients (JC), triangle counting (TC), and triangle centrality (TCE). Their corresponding experimental results based on realworld and synthetic graphs are presented. Arachne is organized as an Arkouda extension package and is publicly available on GitHub (https://github.com/Bears-R-Us/arkouda-njit). 
    more » « less
  3. Exploratory graph analytics helps maximize the informational value for a graph. However, the increasing graph size makes it impossible for existing popular exploratory data analysis tools to handle dozens-of-terabytes or even larger data sets in the memory of a common laptop/personal computer. Arkouda is a framework under early-development that brings together the productivity of Python at the user side with the high-performance of Chapel at the server side. In this paper, the preliminary work on overcoming the memory limit and high performance computing coding roadblock for high level Python users to perform large graph analysis is presented. A simple and succinct graph data structure design and implementation at both the Python front-end and the Chapel back-end in the Arkouda framework are provided. A typical graph algorithm, Breadth-First Search (BFS), is used to show how we can use Chapel to develop high performance parallel graph algorithm productively. Two Chapel based parallel Breadth-First Search (BFS) algorithms, one high level version and one corresponding low level version, have been implemented in Arkouda to support analyzing large graphs. Multiple graph benchmarks are used to evaluate the performance of the provided graph algorithms. Experimental results show that we can optimize the performance by tuning the selection of different Chapel high level data structures and parallel constructs. Our code is open source and available from GitHub (https://github.com/Bader-Research/arkouda). 
    more » « less
  4. Exploratory graph analytics helps maximize the informational value from a graph. However, the increasing graph size makes it impossible for existing popular exploratory data analysis tools to handle dozens-of-terabytes or even larger data sets in the memory of a common laptop/personal computer. Arkouda is a framework under early-development that brings together the productivity of Python at the user side with the high-performance of Chapel at the server side. In this paper, we present preliminary work on overcoming the memory limit and high performance computing coding roadblock for high level Python users to perform large graph analysis. A simple and succinct graph data structure design and implementation at both the Python front-end and the Chapel back-end in the Arkouda framework are provided. A typical graph algorithm, Breadth-First Search (BFS), is used to show how we can use Chapel to develop high performance parallel graph algorithm productively. Two Chapel-based parallel Breadth-First Search (BFS) algorithms, one high level version and one corresponding low level version, have been implemented in Arkouda to support analyzing large graphs. Multiple graph benchmarks are used to evaluate the performance of the provided graph algorithms. Ex- perimental results show that we can optimize the performance by tuning the selection of different Chapel high level data structures and parallel constructs. Our code is open source and available from GitHub (https://github.com/Bader-Research/arkouda). 
    more » « less
  5. Finding connected components is a fundamental problem in graph analysis. We develop a novel minimum- mapping based Contour algorithm to solve the connectivity problem. The Contour algorithm can identify all connected components of an undirected graph within O (log 𝑑𝑚𝑎𝑥 ) iterations on 𝑚 parallel processors, where 𝑑𝑚𝑎𝑥 is the largest diameter of all components in a given graph and 𝑚 is the total number of edges of the given graph. Furthermore, each iteration can easily be parallelized by employing the highly efficient minimum-mapping operator on all edges. To improve performance, the Contour algorithm is further optimized through asynchronous updates and simplified atomic operations. Our algorithm has been integrated into an open-source framework, Arachne, that extends Arkouda for large-scale interactive graph analytics with a Python API powered by the high-productivity parallel language Chapel. Experimental results on real-world and synthetic graphs show that the proposed Contour algorithm needs less number of iterations and can achieve 5.26 folds of speedup on average compared with the state-of-the-art connected component method FastSV implemented in Chapel. All code is publicly available on GitHub (https://github.com/Bears-R-Us/arkouda-njit). 
    more » « less