I/O Characterization and Performance Evaluation of BeeGFS for Deep Learning

Chowdhury, Fahim; Zhu, Yue; Heer, Todd; Paredes, Saul; Moody, Adam; Goldstone, Robin; Mohror, Kathryn; Yu, Weikuan

doi:10.1145/3337821.3337902

Parallel File Systems (PFSs) are frequently deployed on leadership High Performance Computing (HPC) systems to ensure efficient I/O, persistent storage, and scalable performance. Emerging Deep Learning (DL) applications impose new I/O and storage requirements on HPC systems because they read batched input made up of small files accessed in random order. PFSs therefore need commensurate features to meet the needs of DL applications. BeeGFS is a recently emerging PFS that has attracted attention from both research and industry because of its performance, scalability, and ease of use. In this paper, we present the architectural and system features of BeeGFS and carry out a systematic performance analysis, evaluating it experimentally with cutting-edge I/O, metadata, and DL application benchmarks. In particular, we use the AlexNet and ResNet-50 models to classify the ImageNet dataset with the Livermore Big Artificial Neural Network Toolkit (LBANN), and an ImageNet data reader pipeline atop TensorFlow and Horovod. Through extensive performance characterization of BeeGFS, our study provides useful documentation on how to leverage BeeGFS for emerging DL applications.

Award ID(s): 1763547, 1744336, 1822737, 1564647, 1561041

PAR ID: 10156298

Author(s) / Creator(s): Chowdhury, Fahim; Zhu, Yue; Heer, Todd; Paredes, Saul; Moody, Adam; Goldstone, Robin; Mohror, Kathryn; Yu, Weikuan

Date Published: 2019-08-05

Journal Name: ICPP 2019: Proceedings of the 48th International Conference on Parallel Processing

Page Range / eLocation ID: 1 to 10

Sponsoring Org: National Science Foundation

Conference Paper: https://doi.org/10.1145/3337821.3337902
