skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Contention-Aware Performance Prediction For Virtualized Network Functions
At the core of Network Functions Virtualization lie Network Functions (NFs) that run co-resident on the same server, contend over its hardware resources and, thus, might suffer from reduced performance relative to running alone on the same hardware. Therefore, to efficiently manage resources and meet performance SLAs, NFV orchestrators need mechanisms to predict contention-induced performance degradation. In this work, we find that prior performance prediction frameworks suffer from poor accuracy on modern architectures and NFs because they treat memory as a monolithic whole. In addition, we show that, in practice, there exist multiple components of the memory subsystem that can separately induce contention. By precisely characterizing (1) the pressure each NF applies on the server's shared hardware resources (contentiousness) and (2) how susceptible each NF is to performance drop due to competing contentiousness (sensitivity), we develop SLOMO, a multivariable performance prediction framework for Network Functions. We show that relative to prior work SLOMO reduces prediction error by 2-5x and enables 6-14% more efficient cluster utilization. SLOMO's codebase can be found at https://github.com/cmu-snap/SLOMO.  more » « less
Award ID(s):
1700521
PAR ID:
10180451
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
SIGCOMM '20: Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication
Page Range / eLocation ID:
270 to 282
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Ensuring high scalability (elastic scale-out and consolidation), as well as high availability (failure resiliency) are critical in encouraging adoption of software-based network functions (NFs). In recent years, two paradigms have evolved in terms of the way the NFs manage their state - namely the Stateful (state is coupled with the NF instance) and a Stateless (state is externalized to a datastore) manner. These two paradigms present unique challenges and opportunities for ensuring high scalability and high availability of NFs and NF chains. In this work, we assess the impact on ensuring the correctness of NF state including the implications of non-determinism in packet processing, and carefully analyze and present the benefits and disadvantages of the two state management paradigms. We leverage OpenNetVM and Redis in-memory datastore to implement both state management paradigms and empirically compare the two. Although the stateless paradigm is desirable for elastic scaling, our experimental results show that, even at line-rate packet processing (10 Gbps), stateful NFs can achieve chain-level failover across servers in a LAN incurring less than 10% performance. The state-of-the-art stateless counterparts incur severe throughput penalties. We observe 30-85% overhead on normal processing, depending on the mode of state updated to the externalized datastore. 
    more » « less
  2. Ensuring high availability (HA) for software-based networks is a critical design feature that will help the adoption of software-based network functions (NFs) in production networks. It is important for NFs to avoid outages and maintain mission-critical operations. However, HA support for NFs on the critical data path can result in unacceptable performance degradation. We present REINFORCE, an integrated framework to support efficient resiliency for NFs and NF service chains. REINFORCE includes timely failure detection and consistent failover mechanisms. REINFORCE replicates state to standby NFs (local and remote) while enforcing correctness. It minimizes the number of state transfers by exploiting the concept of external synchrony, and leverages opportunistic batching and multi-buffering to optimize performance. Experimental results show that, even at line-rate packet processing (10 Gbps), REINFORCE achieves chain-level failover across servers in a LAN (or within the same node) within 10ms (100/μs), incurring less than 10% (1%) performance overhead, and adds average latency of only ~400/μs (5/μs), with a worst-case latency of less than 1ms (10/μs). 
    more » « less
  3. Elastic scaling is a central promise of NFV but has been hard to realize in practice. The difficulty arises because most Network Functions (NFs) are stateful and this state need to be shared across NF instances. Implementing state sharing while meeting the throughput and latency requirements placed on NFs is challenging and, to date, no solution exists that meets NFV’s performance goals for the full spectrum of NFs. S6 is a new framework that supports elastic scaling of NFs without compromising performance. Its design builds on the insight that a distributed shared state abstraction is well-suited to the NFV context. We organize state as a distributed shared object (DSO) space and extend the DSO concept with techniques designed to meet the need for elasticity and high-performance in NFV workloads. S6 simplifies development: NF writers program with no awareness of how state is distributed and shared. Instead, S6 transparently migrates state and handles accesses to shared state. In our evaluation, compared to recent solutions for dynamic scaling of NFs, S6 improves performance by 100x during scaling events [25], and by 2-5x under normal operation 
    more » « less
  4. 5G technology transitions the cellular network core from specialized hardware into software-based cloud-native network functions (NFs). As part of this change, the 3GPP defines an access control policy to protect NFs from one another and third-party network applications. A manual review of this policy by the 3GPP identified an over-privilege flaw that exposes cryptographic keys to all NFs. Unfortunately, such a manual review is difficult due to ambiguous documentation. In this paper, we use static program analysis to extract NF functionality from four 5G core implementations and compare that functionality to what is permissible by the 3GPP policy. We discover two previously unknown instances of over-privilege that can lead denial-of-service and extract sensitive data. We have reported our findings to the GSMA, who has confirmed the significance of these policy flaws. 
    more » « less
  5. Graphics Processing Units (GPUs) exploit large amounts of thread-level parallelism to provide high instruction throughput and to efficiently hide long-latency stalls. The resulting high throughput, along with continued programmability improvements, have made GPUs an essential computational resource in many domains. Applications from different domains can have vastly different compute and memory demands on the GPU. In a large-scale computing environment, to efficiently accommodate such wide-ranging demands without leaving GPU resources underutilized, multiple applications can share a single GPU, akin to how multiple applications execute concurrently on a CPU. Multi-application concurrency requires several support mechanisms in both hardware and software. One such key mechanism is virtual memory, which manages and protects the address space of each application. However, modern GPUs lack the extensive support for multi-application concurrency available in CPUs, and as a result suffer from high performance overheads when shared by multiple applications, as we demonstrate. We perform a detailed analysis of which multi-application concurrency support limitations hurt GPU performance the most. We find that the poor performance is largely a result of the virtual memory mechanisms employed in modern GPUs. In particular, poor address translation performance is a key obstacle to efficient GPU sharing. State-of-the-art address translation mechanisms, which were designed for single-application execution, experience significant inter-application interference when multiple applications spatially share the GPU. This contention leads to frequent misses in the shared translation lookaside buffer (TLB), where a single miss can induce long-latency stalls for hundreds of threads. As a result, the GPU often cannot schedule enough threads to successfully hide the stalls, which diminishes system throughput and becomes a first-order performance concern. Based on our analysis, we propose MASK, a new GPU framework that provides low-overhead virtual memory support for the concurrent execution of multiple applications. MASK consists of three novel address-translation-aware cache and memory management mechanisms that work together to largely reduce the overhead of address translation: (1) a token-based technique to reduce TLB contention, (2) a bypassing mechanism to improve the effectiveness of cached address translations, and (3) an application-aware memory scheduling scheme to reduce the interference between address translation and data requests. Our evaluations show that MASK restores much of the throughput lost to TLB contention. Relative to a state-of-the-art GPU TLB, MASK improves system throughput by 57.8%, improves IPC throughput by 43.4%, and reduces application-level unfairness by 22.4%. MASK's system throughput is within 23.2% of an ideal GPU system with no address translation overhead. 
    more » « less