Search for: All records

Award ID contains: 2018627

Scaling Large Language Model Training on Frontier with Low-Bandwidth Partitioning

Xu, L; Anthony, Q; Hatef, J; Shafi, A; Subramoni, H; Panda, DK (December 2024, IEEE International Conference on High Performance Computing, Data, and Analytics)

Full Text Available
Design and Implementation of Kernel-based MPI Reduction Operations for Intel GPUs

Chen, C; Kuncham, G; Subramoni, H; Panda, DK (December 2024, IEEE International Conference on High Performance Computing, Data, and Analytics)

Full Text Available
HyperSack: Distributed Hyperparameter Optimization for Deep Learning using Resource-Aware Scheduling on Heterogeneous GPU Systems

Alnaasan, N; Ramesh, B; Yao, J; Shafi, A; Subramoni, H; Panda, DK (December 2024, IEEE International Conference on High Performance Computing, Data, and Analytics)

Full Text Available
Using BlueField-3 SmartNICs to Offload Vector Operations in Krylov Subspace Methods

Suresh, K; Michalowicz, B; Contini, N; Ramesh, B; Abduljabbar, M; Shafi, A; Subramoni, H; Panda, DK (December 2024, IEEE International Conference on High Performance Computing, Data, and Analytics)

Full Text Available
Masked LoGoNet: Fast and Accurate 3D Image Analysis for Medical Domain

https://doi.org/10.1145/3637528.3672069

Karimi_Monsefi, Amin; Karisani, Payam; Zhou, Mengxi; Choi, Stacey; Doble, Nathan; Ji, Heng; Parthasarathy, Srinivasan; Ramnath, Rajiv (August 2024, ACM)

Full Text Available
OHIO: Improving RDMA Network Scalability in MPI_Alltoall Through Optimized Hierarchical and Intra/Inter-Node Communication Overlap Design

https://doi.org/10.1109/HOTI63208.2024.00019

Tran, Tu; Kuncham, Goutham_Kalikrishna Reddy; Ramesh, Bharath; Xu, Shulei; Subramoni, Hari; Abduljabbar, Mustafa; Panda, Dhabaleswar_K (August 2024, IEEE)

Full Text Available
Demystifying the Communication Characteristics for Distributed Transformer Models

https://doi.org/10.1109/HOTI63208.2024.00020

Anthony, Quentin; Michalowicz, Benjamin; Hatef, Jacob; Xu, Lang; Abduljabbar, Mustafa; Shafi, Aamir; Subramoni, Hari; Panda, Dhabaleswar K (August 2024, IEEE)

Full Text Available
Characterizing Communication in Distributed Parameter-Efficient Fine-Tuning for Large Language Models

https://doi.org/10.1109/HOTI63208.2024.00014

Alnaasan, Nawras; Huang, Horng-Ruey; Shafi, Aamir; Subramoni, Hari; Panda, Dhabaleswar K (August 2024, IEEE)

Full Text Available
The Case for Co-Designing Model Architectures with Hardware

https://doi.org/10.1145/3673038.3673136

Anthony, Quentin; Hatef, Jacob; Narayanan, Deepak; Biderman, Stella; Bekman, Stas; Yin, Junqi; Shafi, Aamir; Subramoni, Hari; Panda, Dhabaleswar (August 2024, ACM)

Full Text Available
RR-Compound: RDMA-Fused gRPC for Low Latency, High Throughput, and Easy Interface

https://doi.org/10.1109/TPDS.2024.3404394

Geng, Liang; Wang, Hao; Meng, Jingsong; Fan, Dayi; Ben-Romdhane, Sami; Pichumani, Hari Kadayam; Phegade, Vinay; Zhang, Xiaodong (August 2024, IEEE Transactions on Parallel and Distributed Systems)

We have developed RR-Compound, an open-source software package that provides low latency, high throughput, and an easy-to-use interface.
Full Text Available
