skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 10:00 PM ET on Friday, February 6 until 10:00 AM ET on Saturday, February 7 due to maintenance. We apologize for the inconvenience.


Title: BYO: A Unified Framework for Benchmarking Large-Scale Graph Containers
A fundamental building block in any graph algorithm is agraph container -- a data structure used to represent the graph. Ideally, a graph container enables efficient access to the underlying graph, has low space usage, and supports updating the graph efficiently. In this paper, we conduct an extensive empirical evaluation of graph containers designed to support running algorithms on large graphs. To our knowledge, this is the firstapples-to-applescomparison of graph containers rather than overall systems, which include confounding factors such as differences in algorithm implementations and infrastructure. We measure the running time of 10 highly-optimized algorithms across over 20 different containers and 10 graphs. Somewhat surprisingly, we find that the average algorithm running time does not differ much across containers, especially those that support dynamic updates. Specifically, a simple container based on an off-the-shelf B-tree is only 1.22× slower on average than a highly optimized static one. Moreover, we observe that simplifying a graph-container Application Programming Interface (API) to only a few simple functions incurs a mere 1.16× slowdown compared to a complete API. Finally, we also measure batch-insert throughput in dynamic-graph containers for a full picture of their performance. To perform the benchmarks, we introduce BYO, a unified framework that standardizes evaluations of graph-algorithm performance across different graph containers. BYO extends the Graph Based Benchmark Suite (Dhulipala et al. 18), a state-of-the-art graph algorithm benchmark, to easily plug into different dynamic graph containers and enable fair comparisons between them on a large suite of graph algorithms. While several graph algorithm benchmarks have been developed to date, to the best of our knowledge, BYO is the first system designed to benchmark graph containers.  more » « less
Award ID(s):
2317194
PAR ID:
10609057
Author(s) / Creator(s):
; ; ; ; ; ;
Publisher / Repository:
Proceedings of the VLDB Endowment
Date Published:
Journal Name:
Proceedings of the VLDB Endowment
Volume:
17
Issue:
9
ISSN:
2150-8097
Page Range / eLocation ID:
2307 to 2320
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We introduce the ParClusterers Benchmark Suite (PCBS)---a collection of highly scalable parallel graph clustering algorithms and benchmarking tools that streamline comparing different graph clustering algorithms and implementations. The benchmark includes clustering algorithms that target a wide range of modern clustering use cases, including community detection, classification, and dense subgraph mining. The benchmark toolkit makes it easy to run and evaluate multiple instances of different clustering algorithms with respect to both the running time and quality. We evaluate the PCBS algorithms empirically and find that they deliver both the state of the art quality and the running time. In terms of the running time, they are on average over 4x faster than the fastest library we compared to. In terms of quality, the correlation clustering algorithm [Shi et al., VLDB'21] optimizing for the LambdaCC objective, which does not have a direct counterpart in other libraries, delivers the highest quality in the majority of datasets that we used. 
    more » « less
  2. Container systems (e.g., Docker) provide a well-defined, lightweight, and versatile foundation to streamline the process of tool deployment, to provide a consistent and repeatable experimental interface, and to leverage data centers in the global cloud infrastructure as measurement vantage points. However, the virtual network devices commonly used to connect containers to the Internet are known to impose latency overheads which distort the values reported by measurement tools running inside containers. In this study, we develop a tool called MACE to measure and remove the latency overhead of virtual network devices as used by Docker containers. A key insight of MACE is the fact that container functions all execute in the same kernel. Based on this insight, MACE is implemented as a Linux kernel module using the trace event subsystem to measure latency along the network stack code path. Using CloudLab, we evaluate MACE by comparing the ping measurements emitted from a slim-ping container to the ones emitted using the same tool running in the bare metal machine under varying traffic loads. Our evaluation shows that the MACE-adjusted RTT measurements are within 20 µs of the bare metal ping RTTs on average while incurring less than 25 µs RTT perturbation. We also compare RTT perturbation incurred by MACE with perturbation incurred by the built-in ftrace kernel tracing system and find that MACE incurs less perturbation. 
    more » « less
  3. Evaluations—encompassing computational evaluations, benchmarks and user studies—are essential tools for validating the performance and applicability of graph and network layout algorithms (also known as graph drawing). These evaluations not only offer significant insights into an algorithm's performance and capabilities, but also assist the reader in determining if the algorithm is suitable for a specific purpose, such as handling graphs with a high volume of nodes or dense graphs. Unfortunately, there is no standard approach for evaluating layout algorithms. Prior work holds a ‘Wild West’ of diverse benchmark datasets and data characteristics, as well as varied evaluation metrics and ways to report results. It is often difficult to compare layout algorithms without first implementing them and then running your own evaluation. In this systematic review, we delve into the myriad of methodologies employed to conduct evaluations—the utilized techniques, reported outcomes and the pros and cons of choosing one approach over another. Our examination extends beyond computational evaluations, encompassing user‐centric evaluations, thus presenting a comprehensive understanding of algorithm validation. This systematic review—and its accompanying website—guides readers through evaluation types, the types of results reported, and the available benchmark datasets and their data characteristics. Our objective is to provide a valuable resource for readers to understand and effectively apply various evaluation methods for graph layout algorithms. A free copy of this paper and all supplemental material is available at osf.io, and the categorized papers are accessible on our website at https://visdunneright.github.io/gd-comp-eval/ 
    more » « less
  4. Cryptographic (crypto) API misuses often cause security vulnerabilities, so static and dynamic analyzers were recently proposed to detect such misuses. These analyzers differ in strengths and weaknesses, and they can miss bugs. Motivated by the inherent limitations of existing analyzers, we study runtime verification (RV) as an alternative for crypto API misuse detection. RV monitors program runs against formal specifications and was shown to be effective and efficient for amplifying the bug-finding ability of software tests. We focus on the popular JCA crypto API and write 22 RV specifications based on expert-validated rules in a static analyzer. We monitor these specifications while running tests in five benchmarks. Lastly, we compare the accuracy of our RV-based approach, RVSec, with those of three state-of-the-art crypto API misuses detectors: CogniCrypt, CryptoGuard, and CryLogger. RVSec has higher accuracy in four benchmarks and is on par with CryptoGuard in the fifth. Overall, RVSec achieves an average F1 measure of 95%, compared with 83%, 78%, and 86% for CogniCrypt, CryptoGuard, and CryLogger, respectively. We show that RV is effective for detecting crypto API misuses and highlight the strengths and limitations of these tools. We also discuss how static and dynamic analysis can complement each other for detecting crypto API misuses. 
    more » « less
  5. Identifying key members in large social network graphs is an important graph analytic. Recently, a new centrality measure called triangle centrality finds members based on the triangle support of vertices in graph. In this paper, we describe our rapid implementation of triangle centrality using Graph-BLAS, an API specification for describing graph algorithms in the language of linear algebra. We use triangle centrality’s algebraic algorithm and easily implement it using the SuiteSparse GraphBLAS library. A set of experiments on large, sparse graph datasets is conducted to verify the implementation. 
    more » « less