NSF PAR Search | NSF Public Access Repository


Context-aware Prefetching for Near-Storage Accelerators

Zhang, Jian; Nguyen, Marie; Kashyap, Sanidhya; Kannan, Sudarsun (July 2024, ACM)

We present ContextPrefetcher, a host-guided, high-performance prefetching framework for near-storage accelerators that prefetches data blocks from storage (e.g., NAND) to device-level RAM. Efficiently prefetching data blocks to device-level RAM reduces storage access costs and improves I/O performance. We introduce a novel abstraction, Cross-layered Context (CLC), a virtual entity that spans the host and the device and is used for identifying, managing, and tracking active and inactive data such as files, objects (within object stores), or a range of blocks. To support efficient prefetching of actively used CLCs to device memory without incurring near-device resource (memory and compute) bottlenecks, ContextPrefetcher delegates prefetching management to the host, which guides near-device compute to prefetch blocks of active CLCs. Finally, ContextPrefetcher facilitates the swift reclamation of blocks associated with inactive CLCs. Preliminary evaluation against state-of-the-art near-storage accelerator designs demonstrates performance gains of up to 1.34X.
Full Text Available
OmniCache: Collaborative Caching for Near-storage Accelerators

Zhang, Jian; Ren, Yujie; Nguyen, Marie; Min, Changwoo; Kannan, Sudarsun (February 2024, USENIX)

We propose OmniCache, a novel caching design for near-storage accelerators that combines near-storage and host memory capabilities to accelerate I/O and data processing. First, OmniCache introduces a “near-cache” approach, maximizing data access to the nearest cache for I/O and processing operations. Second, OmniCache presents collaborative caching for concurrent I/O and data processing by using host and device caches. Third, OmniCache incorporates dynamic model-driven offloading support, which actively monitors hardware and software metrics for efficient processing across host and device processors. Finally, OmniCache explores extensibility to the newly introduced CXL, a memory expansion technology. OmniCache demonstrates significant performance gains of up to 3.24X for I/O workloads and 3.06X for data processing workloads.
Full Text Available
FusionFS: Fusing I/O Operations using CISCOps in Firmware File Systems

Zhang, Jian; Ren, Yujie (April 2022, File and storage technologies)

We present FusionFS, a direct-access firmware-level in-storage filesystem that exploits the near-storage computational capability for fast I/O and data processing, consequently reducing I/O bottlenecks. In FusionFS, we introduce a new abstraction, CISCOps, that combines multiple I/O and data processing operations into one fused operation that is offloaded for near-storage processing. By offloading, CISCOps significantly reduces dominant I/O overheads such as system calls, data movement, communication, and other software overheads. Further, to enhance the use of CISCOps, we introduce MicroTx for fine-grained crash consistency and fast (automatic) recovery of I/O and data processing operations. We also explore scheduling techniques to ensure fair and efficient use of in-storage compute and memory resources across tenants. Evaluation of FusionFS against the state-of-the-art user-level, kernel-level, and firmware-level file systems using microbenchmarks, macrobenchmarks, and real-world applications shows up to 6.12X, 5.09X and 2.07X performance gains, and 2.65X faster recovery for applications.
Full Text Available
Scale and Performance in a Filesystem Semi-Microkernel

https://doi.org/10.1145/3477132.3483581

Liu, Jing; Rebello, Anthony; Dai, Yifan; Ye, Chenhao; Kannan, Sudarsun; Arpaci-Dusseau, Andrea C.; Arpaci-Dusseau, Remzi H. (October 2021, SOSP '21: Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles)

Full Text Available
KLOCs: kernel-level object contexts for heterogeneous memory systems

https://doi.org/10.1145/3445814.3446745

Kannan, Sudarsun; Ren, Yujie; Bhattacharjee, Abhishek (April 2021, Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems)
Full Text Available
pFSCK: Accelerating File System Checking and Repair for Modern Storage

Domingo, David; Kannan, Sudarsun (February 2021, Proceedings of the 19th USENIX Conference on File and Storage Technologies)
Full Text Available
CrossFS: A Cross-layered Direct-Access File System

Ren, Yujie; Min, Changwoo (November 2020, 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2020))
Full Text Available
The Need for Precise and Efficient Memory Capacity Budgeting

https://doi.org/10.1145/3422575.3422791

Garg, Shaleen; Kannan, Sudarsun; Parashar, Manish (September 2020, MEMSYS 2020: The International Symposium on Memory Systems)

Full Text Available
CompoundFS: Compounding I/O Operations in Firmware File Systems

Ren, Yujie; Zhang, Jian (July 2020, 12th USENIX Workshop on Hot Topics in Storage and File Systems)

Full Text Available
