HeteroBench: Multi-kernel Benchmarks for Heterogeneous Systems

Tian, Hongzheng (ORCID:0009000462535059); Mishra, Alok (ORCID:0000000331401088); Chen, Zhiheng (ORCID:0009000771808655); Hong_Enriquez, Rolando P (ORCID:0009000856524408); Milojicic, Dejan (ORCID:0000000198308588); Frachtenberg, Eitan (ORCID:0000000237091829); Huang, Sitao (ORCID:0000000176691467)

doi:10.1145/3676151.3719366

The end of Moore’s Law and Dennard scaling has driven the proliferation of heterogeneous systems with accelerators, including CPUs, GPUs, and FPGAs, each with distinct architectures, compilers, and programming environments. GPUs excel at massively parallel processing for tasks like deep learning training and graphics rendering, while FPGAs offer hardware-level flexibility and energy efficiency for low-latency, high-throughput applications. In contrast, CPUs, while general-purpose, often fall short in high-parallelism or power-constrained applications. This architectural diversity makes it challenging to compare these accelerators effectively, leading to uncertainty in selecting optimal hardware and software tools for specific applications. To address this challenge, we introduce HeteroBench, a versatile benchmark suite for heterogeneous systems. HeteroBench allows users to evaluate multi-compute kernel applications across various accelerators, including CPUs, GPUs (from NVIDIA, AMD, Intel), and FPGAs (AMD), supporting programming environments of Python, Numba-accelerated Python, serial C++, OpenMP (both CPUs and GPUs), OpenACC and CUDA for GPUs, and Vitis HLS for FPGAs. This setup enables users to assign kernels to suitable hardware platforms, ensuring comprehensive device comparisons. What makes HeteroBench unique is its vendor-agnostic, cross-platform approach, spanning diverse domains such as image processing, machine learning, numerical computation, and physical simulation, ensuring deeper insights for HPC optimization. Extensive testing across multiple systems provides practical reference points for HPC practitioners, simplifying hardware selection and performance tuning for both developers and end-users alike. This suite may assist to make more informed decision on AI/ML deployment and HPC development, making it an invaluable resource for advancing academic research and industrial applications.

More Like this