MariusGNN: Resource-Efficient Out-of-Core Training of Graph Neural Networks

Waleffe, Roger; Mohoney, Jason; Rekatsinas, Theodoros; Venkataraman, Shivaram

Citation Details

We study training of Graph Neural Networks (GNNs) for large-scale graphs. We revisit the premise of using distributed training for billion-scale graphs and show that for graphs that fit in main memory or the SSD of a single machine, out-of-core pipelined training with a single GPU can outperform state-of-the-art (SoTA) multi-GPU solutions. We introduce MariusGNN, the first system that utilizes the entire storage hierarchy, including disk, for GNN training. MariusGNN introduces a series of data organization and algorithmic contributions that 1) minimize the end-to-end time required for training and 2) ensure that models learned with disk-based training exhibit accuracy similar to those fully trained in memory. We evaluate MariusGNN against SoTA systems for learning GNN models and find that single-GPU training in MariusGNN achieves the same level of accuracy up to 8× faster than multi-GPU training in these systems, yielding an order-of-magnitude reduction in monetary cost. MariusGNN is open-sourced at www.marius-project.org.

Award ID(s): 1815538

PAR ID: 10385071

Author(s) / Creator(s): Waleffe, Roger; Mohoney, Jason; Rekatsinas, Theodoros; Venkataraman, Shivaram

Date Published: 2023-05-08

Journal Name: Eighteenth European Conference on Computer Systems (EuroSys ’23)

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
The DOI is not currently available.
