NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

FedCaSe: Enhancing Federated Learning with Heterogeneity-aware Caching and Scheduling

Khan, Redwan_Ibne_Seraj; Paul, Arnab_K; Cheng, Yue; Jian, Xun; Butt, Ali_R (November 2024, Proceedings of ACM Symposium on Cloud Computing)

Free, publicly-accessible full text available November 20, 2025
An End-to-end High-performance Deduplication Scheme for Docker Registries and Docker Container Storage Systems

https://doi.org/10.1145/3643819

Zhao, Nannan; Lin, Muhui; Albahar, Hadeel; Paul, Arnab K; Huan, Zhijie; Abraham, Subil; Chen, Keren; Tarasov, Vasily; Skourtis, Dimitrios; Anwar, Ali; et al (August 2024, ACM Transactions on Storage)

The wide adoption of Docker containers for supporting agile and elastic enterprise applications has led to a broad proliferation of container images. The associated storage performance and capacity requirements place a high pressure on the infrastructure ofcontainer registriesthat store and distribute images andcontainer storage systemson the Docker client side that manage image layers and store ephemeral data generated at container runtime. The storage demand is worsened by the large amount of duplicate data in images. Moreover, container storage systems that use Copy-on-Write (CoW) file systems as storage drivers exacerbate the redundancy. Exploiting the high file redundancy in real-world images is a promising approach to drastically reduce the growing storage requirements of container registries and improve the space efficiency of container storage systems. However, existing deduplication techniques significantly degrade the performance of both registries and container storage systems because of data reconstruction overhead as well as the deduplication cost. We propose DupHunter, an end-to-end deduplication scheme that deduplicates layers for both Docker registries and container storage systems while maintaining a high image distribution speed and container I/O performance. DupHunter is divided into three tiers: registry tier, middle tier, and client tier. Specifically, we first build a high-performance deduplication engine at the registry tier that not only natively deduplicates layers for space savings but also reduces layer restore overhead. Then, we use deduplication offloading at the middle tier to eliminate the redundant files from the client tier and avoid bringing deduplication overhead to the clients. To further reduce the data duplicates caused by CoWs and improve the container I/O performance, we utilize a container-aware storage system at the client tier that reserves space for each container and arranges the placement of files and their modifications on the disk to preserve locality. Under real workloads, DupHunter reduces storage space by up to 6.9× and reduces theGETlayer latency up to 2.8× compared to the state-of-the-art. Moreover, DupHunter can improve the container I/O performance by up to 93% for reads and 64% for writes.
more » « less
Full Text Available
Alps: An Adaptive Learning, Priority OS Scheduler for Serverless Function

Fu, Yuqi; Shi, Ruizhe; Wang, Haoliang; Chen, Songqing; Cheng, Yue (July 2024, Proceedings of the 2024 USENIX Annual Technical Conference)

Full Text Available
DyLeCT: Achieving Huge-page-like Translation Performance for Hardware-compressed Memory

https://doi.org/10.1109/ISCA59077.2024.00085

Panwar, Gagandeep; Laghari, Muhammad; Choukse, Esha; Jian, Xun (June 2024, IEEE)

Full Text Available
Tarazu: An Adaptive End-to-end I/O Load-balancing Framework for Large-scale Parallel File Systems

https://doi.org/10.1145/3641885

Paul, Arnab K; Neuwirth, Sarah; Wadhwa, Bharti; Wang, Feiyi; Oral, Sarp; Butt, Ali R (May 2024, ACM Transactions on Storage)

The imbalanced I/O load on large parallel file systems affects the parallel I/O performance of high-performance computing (HPC) applications. One of the main reasons for I/O imbalances is the lack of a global view of system-wide resource consumption. While approaches to address the problem already exist, the diversity of HPC workloads combined with different file striping patterns prevents widespread adoption of these approaches. In addition, load-balancing techniques should be transparent to client applications. To address these issues, we proposeTarazu, an end-to-end control plane where clients transparently and adaptively write to a set of selected I/O servers to achieve balanced data placement. Our control plane leverages real-time load statistics for global data placement on distributed storage servers, while our design model employs trace-based optimization techniques to minimize latency for I/O load requests between clients and servers and to handle multiple striping patterns in files. We evaluate our proposed system on an experimental cluster for two common use cases: the synthetic I/O benchmark IOR and the scientific application I/O kernel HACC-I/O. We also use a discrete-time simulator with real HPC application traces from emerging workloads running on the Summit supercomputer to validate the effectiveness and scalability ofTarazuin large-scale storage environments. The results show improvements in load balancing and read performance of up to 33% and 43%, respectively, compared to the state-of-the-art.
more » « less
Full Text Available
Application-Attuned Memory Management for Containerized HPC Workflows

https://doi.org/10.1109/IPDPS57955.2024.00019

Arif, Moiz; Maurya, Avinash; Rafique, M Mustafa; Nikolopoulos, Dimitrios S; Butt, Ali R (May 2024, IEEE)

Full Text Available
Application-Attuned Memory Management for Containerized HPC Workflows

Arif, Moiz; Maurya, Avinash; Rafique, M. Mustafa; Nikolopoulos, Dimitrios S.; Butt. Ali R. (May 2024, Proceedings of the 38th IEEE International Parallel & Distributed Processing Symposium (IPDPS))

Full Text Available
A Closer Look into IPFS: Accessibility, Content, and Performance

https://doi.org/10.1145/3656015

Shi, Ruizhe; Cheng, Ruizhi; Han, Bo; Cheng, Yue; Chen, Songqing (May 2024, Proceedings of the ACM on Measurement and Analysis of Computing Systems)

The InterPlanetary File System (IPFS) has recently gained considerable attention. While prior research has focused on understanding its performance characterization and application support, it remains unclear: (1) what kind of files/content are stored in IPFS, (2) who are providing these files, (3) are these files always accessible, and (4) what affects the file access performance. To answer these questions, in this paper, we perform measurement and analysis on over 4 million files associated with CIDs (content IDs) that appeared in publicly available IPFS datasets. Our results reveal the following key findings: (1) Mixed file accessibility: while IPFS is not designed for a permanent storage, accessing a non-trivial portion of files, such as those of NFTs and video streams, often requires multiple retrieval attempts, potentially blocking NFT transactions and negatively affecting the user experience. (2) Dominance of NFT (non-fungible token) and video files: about 50% of stored files are NFT-related, followed by a large portion of video files, among which about half are pirated movies and adult content. (3) Centralization of content providers: a small number of peers (top-50), mostly cloud nodes hosted by tech companies, serve a large portion (95%) of files, deviating from IPFS's intended design goal. (4) High variation of downloading throughput and lookup time: large file retrievals experience lower average throughput due to more overhead for resolving file chunk CIDs, and looking up files hosted by non-cloud nodes takes longer. We hope that our findings can offer valuable insights for (1) IPFS application developers to take into consideration these characteristics when building applications on top of IPFS, and (2) IPFS system developers to improve IPFS and similar systems to be developed for Web3.
more » « less
Full Text Available
FLOAT: Federated Learning Optimizations with Automated Tuning

https://doi.org/10.1145/3627703.3650081

Khan, Ahmad Faraz; Khan, Azal Ahmad; Abdelmoniem, Ahmed M; Fountain, Samuel; Butt, Ali R; Anwar, Ali (April 2024, ACM)

Full Text Available
FLOAT: Federated Learning Optimizations with Automated Tuning.

Khan, Ahmad Faraz; Khan, Azal Ahmad; Abdelmoniem, Ahmed M.; Fountain, Samuel; Butt, Ali R.; Anwar, Ali (April 2024, Nineteenth European Conference on Computer Systems (EuroSys ’24))

Full Text Available

« Prev Next »

Search for: All records