Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs

Lin, Jiaxin; Ji, Tao; Hao, Xiangpeng; Cha, Hokeun; Le, Yanfang; Yu, Xiangyao; Akella, Aditya

doi:10.1145/3589980

The wide adoption of emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which relieves host CPUs and improves performance. Shuffle, the all-to-all data exchange process, is a critical building block for network communication in distributed data-intensive applications and can potentially benefit from SmartNICs. In this paper, we develop SmartShuffle, which accelerates the shuffle process of data-intensive applications by offloading various computation tasks to SmartNIC devices. SmartShuffle supports offloading both low-level network functions, including data partitioning and network transport, and high-level computation tasks, including filtering, aggregation, and sorting. SmartShuffle adopts a coordinated offload architecture in which sender-side and receiver-side SmartNICs jointly contribute to the benefits of shuffle computation offload. SmartShuffle carefully manages the tight and time-varying computation and memory constraints on the device. We propose a liquid offloading approach, which dynamically migrates operators between the host CPU and the SmartNIC at runtime so that the resources of both devices are fully utilized. We prototype SmartShuffle on Stingray SoC SmartNICs and plug it into Spark. Our evaluation shows that SmartShuffle improves host CPU efficiency and I/O efficiency while lowering job completion time. SmartShuffle outperforms Spark and Spark RDMA by up to 40% on TPC-H.
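To make the shuffle terminology in the abstract concrete, the following is a minimal, purely illustrative Python sketch of an all-to-all shuffle with data partitioning and optional sender-side partial aggregation. It is not SmartShuffle's implementation and does not model a SmartNIC: the function names (`partition`, `pre_aggregate`, `shuffle`), the modulo partitioner over integer keys, and the record count used as a stand-in for network traffic are all assumptions made for this example.

```python
from collections import defaultdict


def partition(records, num_receivers):
    """Split (key, value) records into one bucket per receiver.

    Assumes integer keys and a simple modulo partitioner (hypothetical).
    """
    buckets = [[] for _ in range(num_receivers)]
    for key, value in records:
        buckets[key % num_receivers].append((key, value))
    return buckets


def pre_aggregate(bucket):
    """Sum values per key, returning at most one record per distinct key."""
    sums = defaultdict(int)
    for key, value in bucket:
        sums[key] += value
    return sorted(sums.items())


def shuffle(senders, num_receivers, aggregate_at_sender):
    """Run an all-to-all exchange and count the records that cross the 'network'."""
    received = [[] for _ in range(num_receivers)]
    records_sent = 0
    for records in senders:
        for dest, bucket in enumerate(partition(records, num_receivers)):
            if aggregate_at_sender:
                # The kind of operator that could run before data leaves the sender.
                bucket = pre_aggregate(bucket)
            records_sent += len(bucket)
            received[dest].extend(bucket)
    # Final per-key aggregation on each receiver.
    results = [dict(pre_aggregate(bucket)) for bucket in received]
    return results, records_sent


senders = [
    [(1, 10), (2, 5), (1, 7), (3, 1)],
    [(1, 1), (2, 2), (2, 3)],
]
plain_results, plain_sent = shuffle(senders, 2, aggregate_at_sender=False)
agg_results, agg_sent = shuffle(senders, 2, aggregate_at_sender=True)
print(plain_results == agg_results, plain_sent, agg_sent)
```

In this toy input both runs produce the same per-receiver sums, while sender-side aggregation sends fewer records. How much traffic or host CPU work is saved in a real deployment depends on the workload and on device constraints, which this sketch does not capture.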

Award ID(s): 2106199

PAR ID: 10434985

Author(s) / Creator(s): Lin, Jiaxin; Ji, Tao; Hao, Xiangpeng; Cha, Hokeun; Le, Yanfang; Yu, Xiangyao; Akella, Aditya

Date Published: 2023-05-19

Journal Name: Proceedings of the ACM on Measurement and Analysis of Computing Systems

Volume: 7

Issue: 2

ISSN: 2476-1249

Page Range / eLocation ID: 1 to 23

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article:
https://doi.org/10.1145/3589980
