Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Free, publicly-accessible full text available July 10, 2025
-
With the advent of byte-addressable memory devices, such as CXLmemory, persistent memory, and storage-class memory, tiered memory systems have become a reality. Page migration is the de facto method within operating systems for managing tiered memory. It aims to bring hot data whenever possible into fast memory to optimize the performance of data accesses while using slow memory to accommodate data spilled from fast memory. While the existing research has demonstrated the effectiveness of various optimizations on page migration, it falls short of addressing a fundamental question: Is exclusive memory tiering, in which a page is either present in fast memory or slow memory, but not both simultaneously, the optimal strategy for tiered memory management? We demonstrate that page migration-based exclusive memory tiering suffers significant performance degradation when fast memory is under pressure. In this paper, we propose nonexclusive memory tiering, a page management strategy that retains a copy of pages recently promoted from slow memory to fast memory to mitigate memory thrashing. To enable non-exclusive memory tiering, we develop NOMAD, a new page management mechanism for Linux that features transactional page migration and page shadowing. NOMAD helps remove page migration off the critical path of program execution and makes migration completely asynchronous. Evaluations with carefully crafted micro-benchmarks and real-world applications show that NOMAD is able to achieve up to 6x performance improvement over the state-of-the-art transparent page placement (TPP) approach in Linux when under memory pressure. We also compare NOMAD with a recently proposed hardware-assisted, access sampling-based page migration approach and demonstrate NOMAD’s strengths and potential weaknesses in various scenarios.more » « lessFree, publicly-accessible full text available July 10, 2025
-
With the advent of byte-addressable memory devices, such as CXL memory, persistent memory, and storage-class memory, tiered memory systems have become a reality. Page migration is the de facto method within operating systems for managing tiered memory. It aims to bring hot data whenever possible into fast memory to optimize the performance of data accesses while using slow memory to accommodate data spilled from fast memory. While the existing research has demonstrated the effectiveness of various optimizations on page migration, it falls short of addressing a fundamental question: Is exclusive memory tiering, in which a page is either present in fast memory or slow memory, but not both simultaneously, the optimal strategy for tiered memory management? We demonstrate that page migration-based exclusive memory tiering suffers significant performance degradation when fast memory is under pressure. In this paper, we propose nonexclusive memory tiering, a page management strategy that retains a copy of pages recently promoted from slow memory to fast memory to mitigate memory thrashing. To enable non-exclusive memory tiering, we develop NOMAD, a new page management mechanism for Linux that features transactional page migration and page shadowing. NOMAD helps remove page migration off the critical path of program execution and makes migration completely asynchronous. Evaluations with carefully crafted micro-benchmarks and real-world applications show that NOMAD is able to achieve up to 6x performance improvement over the state-of-the-art transparent page placement (TPP) approach in Linux when under memory pressure. We also compare NOMAD with a recently proposed hardware-assisted, access sampling-based page migration approach and demonstrate NOMAD’s strengths and potential weaknesses in various scenarios.more » « lessFree, publicly-accessible full text available July 10, 2025
-
Fast, byte-addressable persistent memory (PM) is becoming a reality in products. However, porting legacy kernel file systems to fully support PM requires substantial effort and encounters the challenge of bridging the gap between block-based access granularity and byte-addressability. Moreover, new PM-specific file systems remain far from production-ready, preventing them from being widely used. In this paper, we propose P2CACHE, a novel in-kernel caching mechanism to explore how legacy kernel file systems can effectively evolve in the face of fast, byte-addressable PM. P2CACHE exploits a read/write-distinguishable memory hierarchy upon a tiered memory system involving both PM and DRAM. P2CACHE leverages PM to serve all write requests for instant data durability and strong crash consistency while using DRAM to serve most read I/Os for high I/O performance. Further, P2CACHE employs a simple yet effective synchronization model between PM and DRAM by leveraging device-level parallelism. Our evaluation shows that P2CACHE can significantly increase the performance of legacy kernel file systems -- e.g., by 200x for RocksDB on Ext4 -- meanwhile equipping them with instant data durability and strong crash consistency, similar to PM-specialized file systems.more » « less
-
Overlay networks serve as the de facto network virtualization technique for providing connectivity among distributed containers. Despite the flexibility in building customized private container networks, overlay networks incur significant performance loss compared to physical networks (i.e., the native). The culprit lies in the inclusion of multiple network processing stages in overlay networks, which prolongs the network processing path and overloads CPU cores. In this paper, we propose mFlow, a novel packet steering approach to parallelize the in-kernel data path of network flows. mFlow exploits packet-level parallelism in the kernel network stack by splitting the packets of the same flow into multiple micro-flows, which can be processed in parallel on multiple cores. mFlow devises new, generic mechanisms for flow splitting while preserving in-order packet delivery with little overhead. Our evaluation with both micro-benchmarks and real-world applications demonstrates the effectiveness of mFlow, with significantly improved performance – e.g., by 81% in TCP throughput and 139% in UDP compared to vanilla overlay networks. mFlow even achieved higher TCP throughput than the native (e.g., 29.8 vs. 26.6 Gbps).more » « less
-
In cloud-native environments, containers are often deployed within lightweight virtual machines (VMs) to ensure strong security isolation and privacy protection. With the growing demand for customized cloud services, third-party vendors are turning to infrastructure-as-a-service (IaaS) cloud providers to build their own cloud-native platforms, necessitating the need to run a VM or a guest that hosts containers inside another VM instance leased from an IaaS cloud. State-of-the-art nested virtualization in the x86 architecture relies heavily on the host hypervisor to expose hardware virtualization support to the guest hypervisor, not only complicating cloud management but also raising concerns about an increased attack surface at the host hypervisor. This paper presents the design and implementation of PVM, a high-performance guest hypervisor for KVM that is transparent to the host hypervisor and assumes no hardware virtualization support. PVM leverages two key designs: 1) a minimal shared memory region between the guest and guest hypervisor to facilitate state transition between different privilege levels and 2) an efficient shadow page table design to reduce the cost of memory virtualization. PVM has been adopted by a major IaaS cloud provider for hosting tens of thousands of secure containers on a daily basis. Our experiments demonstrate that PVM significantly outperforms current nested virtualization in KVM for memory virtualization, particularly for concurrent workloads, while maintaining comparable performance in CPU and I/O virtualization.more » « less
-
Rapid and accurate detection of the pathogens, such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) for COVID-19, is critical for mitigating the COVID-19 pandemic. Current state-of-the-art pathogen tests for COVID-19 diagnosis are done in a liquid medium and take 10–30 min for rapid antigen tests and hours to days for polymerase chain reaction (PCR) tests. Herein we report novel accurate pathogen sensors, a new test method, and machine-learning algorithms for a breathalyzer platform for fast detection of SARS-CoV-2 virion particles in the aerosol in 30 s. The pathogen sensors are based on a functionalized molecularly-imprinted polymer, with the template molecules being the receptor binding domain spike proteins for different variants of SARS-CoV-2. Sensors are tested in the air and exposed for 10 s to the aerosols of various types of pathogens, including wild-type, D614G, alpha, delta, and omicron variant SARS-CoV-2, BSA (Bovine serum albumin), Middle East respiratory syndrome–related coronavirus (MERS-CoV), influenza, and wastewater samples from local sewage. Our low-cost, fast-responsive pathogen sensors yield accuracy above 99% with a limit-of-detection (LOD) better than 1 copy/μL for detecting the SARS-CoV-2 virus from the aerosol. The machine-learning algorithm supporting these sensors can accurately detect the pathogens, thereby enabling a new and unique breathalyzer platform for rapid COVID-19 tests with unprecedented speeds.more » « less
-
Fast provisioning of serverless functions is salient for serverless platforms. Though lightweight sandboxes (e.g., containers) enclose only necessary files and libraries, a cold launch still requires up to a few seconds to complete. Such slow provisioning prolongs the response time of serverless functions and negatively impacts users’ experiences. This paper analyzes the main reasons for such slowdown and introduces an effective containerization framework, FlashCube. Instead of building a container from scratch, FlashCube quickly and eff iciently assembles it through a group of pre-created general container parts (e.g., namespaces, cgroups, and language runtimes). In addition, FlashCube’s user-space implementation makes it easily applicable to existing commodity serverless platforms. Our preliminary evaluation demonstrates that FlashCube can quickly provision containerized functions in less than 10 ms (vs. ∼400 ms using Docker containers).more » « less