Accelerating Machine Learning Inference with GPUs in ProtoDUNE Data Processing

Cai, Tejin; Herner, Kenneth; Yang, Tingjun; Wang, Michael; Acosta Flechas, Maria; Harris, Philip; Holzman, Burt; Pedro, Kevin; Tran, Nhan

doi:10.1007/s41781-023-00101-0

Abstract

We study the performance of a cloud-based GPU-accelerated inference server to speed up event reconstruction in neutrino data batch jobs. Using detector data from the ProtoDUNE experiment and the standard DUNE grid job submission tools, we attempt to reprocess the data by running several thousand concurrent grid jobs, a rate we expect to be typical of current and future neutrino physics experiments. We process most of the dataset with the GPU version of our processing algorithm and the remainder with the CPU version for timing comparisons. We find that a 100-GPU cloud-based server easily meets the processing demand, and that the GPU version of the event processing algorithm is two times faster than the CPU version running on the newest CPUs in our sample. However, the amount of data transferred to the inference server during the GPU runs can overwhelm even the highest-bandwidth network switches unless care is taken to observe network facility limits or otherwise distribute the jobs to multiple sites. We discuss the lessons learned from this processing campaign and several avenues for future improvements.

Award ID(s): 2117997

NSF-PAR ID: 10471126

Author(s) / Creator(s): Cai, Tejin; Herner, Kenneth; Yang, Tingjun; Wang, Michael; Acosta Flechas, Maria; Harris, Philip; Holzman, Burt; Pedro, Kevin; Tran, Nhan

Publisher / Repository: Springer Science + Business Media

Date Published: 2023-10-27

Journal Name: Computing and Software for Big Science

Volume: 7

Issue: 1

ISSN: 2510-2036

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1007/s41781-023-00101-0
