NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

MEGA: More Efficient Graph Attention for GNNs

Deng, Weishu; Rao, Jia (July 2024, The Institute of Electrical and Electronics Engineers (IEEE))

Full Text Available
Nomad: Non-Exclusive Memory Tiering via Transactional Page Migration

Xiang, Lingfeng; Lin, Zhen; Deng, Weishu; Lu, Hui; Rao, Jia (July 2024, The USENIX Advanced Computing Systems Association)

Full Text Available
Nomad: Non-Exclusive Memory Tiering via Transactional Page Migration

Xiang, Lingfeng; Lin, Zhen; Deng, Weishu; Lu, Hui; Rao, Jia Rao; Yuan, Yifan; Wang, Ren (July 2024, The Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI'24).)

With the advent of byte-addressable memory devices, such as CXLmemory, persistent memory, and storage-class memory, tiered memory systems have become a reality. Page migration is the de facto method within operating systems for managing tiered memory. It aims to bring hot data whenever possible into fast memory to optimize the performance of data accesses while using slow memory to accommodate data spilled from fast memory. While the existing research has demonstrated the effectiveness of various optimizations on page migration, it falls short of addressing a fundamental question: Is exclusive memory tiering, in which a page is either present in fast memory or slow memory, but not both simultaneously, the optimal strategy for tiered memory management? We demonstrate that page migration-based exclusive memory tiering suffers significant performance degradation when fast memory is under pressure. In this paper, we propose nonexclusive memory tiering, a page management strategy that retains a copy of pages recently promoted from slow memory to fast memory to mitigate memory thrashing. To enable non-exclusive memory tiering, we develop NOMAD, a new page management mechanism for Linux that features transactional page migration and page shadowing. NOMAD helps remove page migration off the critical path of program execution and makes migration completely asynchronous. Evaluations with carefully crafted micro-benchmarks and real-world applications show that NOMAD is able to achieve up to 6x performance improvement over the state-of-the-art transparent page placement (TPP) approach in Linux when under memory pressure. We also compare NOMAD with a recently proposed hardware-assisted, access sampling-based page migration approach and demonstrate NOMAD’s strengths and potential weaknesses in various scenarios.
more » « less
Full Text Available
Nomad: {Non-Exclusive} Memory Tiering via Transactional Page Migration

Xiang, Lingfeng; Lin, Zhen; Deng, Weishu; Lu, Hui; Rao, Jia Rao; Yuan, Yifan; Wang, Ren (July 2024, USENIX Association)

Full Text Available
NOMAD: Non-Exclusive Memory Tiering via Transactional Page Migration

Xinag, Lingfeng; Lin, Zhen; Deng, Weishu Deng; Lu, Hui; Rao, Jia; Yuan, Yifan Yuan; Wang, Ren (July 2024, The 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI'24))

With the advent of byte-addressable memory devices, such as CXL memory, persistent memory, and storage-class memory, tiered memory systems have become a reality. Page migration is the de facto method within operating systems for managing tiered memory. It aims to bring hot data whenever possible into fast memory to optimize the performance of data accesses while using slow memory to accommodate data spilled from fast memory. While the existing research has demonstrated the effectiveness of various optimizations on page migration, it falls short of addressing a fundamental question: Is exclusive memory tiering, in which a page is either present in fast memory or slow memory, but not both simultaneously, the optimal strategy for tiered memory management? We demonstrate that page migration-based exclusive memory tiering suffers significant performance degradation when fast memory is under pressure. In this paper, we propose nonexclusive memory tiering, a page management strategy that retains a copy of pages recently promoted from slow memory to fast memory to mitigate memory thrashing. To enable non-exclusive memory tiering, we develop NOMAD, a new page management mechanism for Linux that features transactional page migration and page shadowing. NOMAD helps remove page migration off the critical path of program execution and makes migration completely asynchronous. Evaluations with carefully crafted micro-benchmarks and real-world applications show that NOMAD is able to achieve up to 6x performance improvement over the state-of-the-art transparent page placement (TPP) approach in Linux when under memory pressure. We also compare NOMAD with a recently proposed hardware-assisted, access sampling-based page migration approach and demonstrate NOMAD’s strengths and potential weaknesses in various scenarios.
more » « less
Full Text Available
P2CACHE: Exploring Tiered Memory for {In-Kernel} File Systems Caching

Lin, Zhen; Xiang, Lingfeng; Rao, Jia; Lu, Hui (July 2023, 2023 USENIX Annual Technical Conference (USENIX ATC 23))

Fast, byte-addressable persistent memory (PM) is becoming a reality in products. However, porting legacy kernel file systems to fully support PM requires substantial effort and encounters the challenge of bridging the gap between block-based access granularity and byte-addressability. Moreover, new PM-specific file systems remain far from production-ready, preventing them from being widely used. In this paper, we propose P2CACHE, a novel in-kernel caching mechanism to explore how legacy kernel file systems can effectively evolve in the face of fast, byte-addressable PM. P2CACHE exploits a read/write-distinguishable memory hierarchy upon a tiered memory system involving both PM and DRAM. P2CACHE leverages PM to serve all write requests for instant data durability and strong crash consistency while using DRAM to serve most read I/Os for high I/O performance. Further, P2CACHE employs a simple yet effective synchronization model between PM and DRAM by leveraging device-level parallelism. Our evaluation shows that P2CACHE can significantly increase the performance of legacy kernel file systems -- e.g., by 200x for RocksDB on Ext4 -- meanwhile equipping them with instant data durability and strong crash consistency, similar to PM-specialized file systems.
more » « less
Full Text Available
PVM: Efficient Shadow Paging for Deploying Secure Containers in Cloud-native Environment

https://doi.org/10.1145/3600006.3613158

Huang, Hang; Lai, Jiangshan; Rao, Jia; Lu, Hui; Hou, Wenlong; Su, Hang; Xu, Quan; Zhong, Jiang; Zeng, Jiahao; Wang, Xu; et al (October 2023, ACM)

In cloud-native environments, containers are often deployed within lightweight virtual machines (VMs) to ensure strong security isolation and privacy protection. With the growing demand for customized cloud services, third-party vendors are turning to infrastructure-as-a-service (IaaS) cloud providers to build their own cloud-native platforms, necessitating the need to run a VM or a guest that hosts containers inside another VM instance leased from an IaaS cloud. State-of-the-art nested virtualization in the x86 architecture relies heavily on the host hypervisor to expose hardware virtualization support to the guest hypervisor, not only complicating cloud management but also raising concerns about an increased attack surface at the host hypervisor. This paper presents the design and implementation of PVM, a high-performance guest hypervisor for KVM that is transparent to the host hypervisor and assumes no hardware virtualization support. PVM leverages two key designs: 1) a minimal shared memory region between the guest and guest hypervisor to facilitate state transition between different privilege levels and 2) an efficient shadow page table design to reduce the cost of memory virtualization. PVM has been adopted by a major IaaS cloud provider for hosting tens of thousands of secure containers on a daily basis. Our experiments demonstrate that PVM significantly outperforms current nested virtualization in KVM for memory virtualization, particularly for concurrent workloads, while maintaining comparable performance in CPU and I/O virtualization.
more » « less
PRISM: Streamlined Packet Processing for Containers with Flow Prioritization

Munikar, Manish; Lei, Jiaxin; Lu, Hui; Rao, Jia (July 2022, The 42nd IEEE International Conference on Distributed Computing Systems)

Full Text Available
Characterizing the performance of intel optane persistent memory: a close look at its on-DIMM buffering

https://doi.org/10.1145/3492321.3519556

Xiang, Lingfeng; Zhao, Xingsheng; Rao, Jia; Jiang, Song; Jiang, Hong (March 2022, The EuroSys conference)

Full Text Available
SwitchFlow: preemptive multitasking for deep learning

https://doi.org/10.1145/3464298.3493391

Wu, Xiaofeng; Rao, Jia; Chen, Wei; Huang, Hang; Ding, Chris; Huang, Heng (December 2021, the 22nd International Middleware Conference)

Full Text Available

« Prev Next »

Search for: All records