

Title: GranularNF: Granular Decomposition of Stateful NFV at 100 Gbps Line Speed and Beyond
In this paper, we consider the challenges that arise from the need to scale virtualized network functions (VNFs) at 100 Gbps line speed and beyond. Traditional VNF designs are monolithic in state management and scheduling: they internally maintain all states and the operations associated with them. Without proper design considerations, such designs suffer from limitations when scaling to 100 Gbps link speed and beyond: inefficient cache utilization caused by contention from frequent control-plane activities, computational and memory-intensive tasks taking up CPU time, and shared states requiring synchronization among cores. We address these limitations by arguing for the need to granularly decompose a VNF into data/control components that are co-located within a server but can be independently scaled among the cores. To realize this approach, we design a "serverless" programming framework with novel abstractions that optimize the data components that must process packets at line speed, reduce contention on the data states, and enable run-time scheduling of different components for improved resource utilization. The abstractions, combined with the runtime system we design, help NFV developers focus on the logic and correctness of VNF programming without worrying about how VNFs may be scaled in or out. We evaluate our platform by comparing it with monolithic approaches under different workloads and by analyzing the advantages that separation brings to scalability, performance determinism, and feature velocity.
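The abstract describes the decomposition only at a high level, and the paper's actual API is not reproduced on this page. Purely as an illustration of the idea of separating data and control components that scale independently across cores, here is a minimal Go sketch; the names (DataWorker, dispatch, the ctrl channel) and the per-flow counter state are hypothetical, not GranularNF's interface.

```go
// Illustrative sketch only: the GranularNF paper does not publish this API.
// Names (DataWorker, dispatch, Packet) are hypothetical.
package main

import (
	"fmt"
	"hash/fnv"
)

// Packet is a stand-in for a parsed packet.
type Packet struct {
	FlowID  string
	Payload []byte
}

// DataWorker holds only the per-flow state for the flows hashed to it,
// so workers never synchronize with each other on the fast path.
type DataWorker struct {
	id      int
	in      chan Packet
	ctrl    chan func(map[string]int) // control updates applied between packets
	flowCnt map[string]int            // example data state: per-flow packet count
}

func (w *DataWorker) run() {
	for {
		select {
		case upd := <-w.ctrl: // infrequent control-plane update
			upd(w.flowCnt)
		case p := <-w.in: // line-rate data path
			w.flowCnt[p.FlowID]++
		}
	}
}

// dispatch hashes a packet's flow to one worker, partitioning state by core.
func dispatch(workers []*DataWorker, p Packet) {
	h := fnv.New32a()
	h.Write([]byte(p.FlowID))
	workers[int(h.Sum32())%len(workers)].in <- p
}

func main() {
	workers := make([]*DataWorker, 4) // scale data components independently
	for i := range workers {
		workers[i] = &DataWorker{
			id:      i,
			in:      make(chan Packet, 1024),
			ctrl:    make(chan func(map[string]int), 16),
			flowCnt: make(map[string]int),
		}
		go workers[i].run()
	}
	dispatch(workers, Packet{FlowID: "10.0.0.1->10.0.0.2:80"})
	fmt.Println("workers started:", len(workers))
}
```

The point of the sketch is the separation: each worker owns the state for the flows hashed to it, so the fast path never takes a lock, while control-plane updates arrive on a separate channel and can be scheduled independently of the packet-processing loop.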
Award ID(s):
2106771 1815621 2045478 1931208
NSF-PAR ID:
10356664
Author(s) / Creator(s):
Date Published:
Journal Name:
ACM SIGMETRICS Performance Evaluation Review
Volume:
50
Issue:
2
ISSN:
0163-5999
Page Range / eLocation ID:
46 to 51
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Network function virtualization (NFV) offers the potential for both enhancing service delivery flexibility and reducing overall costs by virtualizing network functions that are traditionally implemented in dedicated hardware. However, the flexibility of NFV comes with considerable compromises, since functions carried in virtual machines can introduce significant performance overhead. In this paper, we present a novel high-performance framework called HYPER, which combines programmable hardware infrastructure and traditional software infrastructure in NFV to achieve both high performance and flexibility for supporting virtualized network functions (VNFs). In HYPER, we design a mediator layer to hide underlying infrastructure heterogeneity from the NFV orchestrator and simplify VNF management. In addition, we design an SLA-aware service chaining algorithm in HYPER to leverage the benefits of the hybrid infrastructure to fulfill both functional and performance requirements from service subscribers (or tenants). To optimize resource utilization efficiency, we also introduce a performance-aware VNF placement algorithm in HYPER, which accommodates both resource and performance requirements in placing VNFs. We implement HYPER in a testbed based on OpenStack and ONetCard. Experimental results show that HYPER reduces the forwarding latency of a service chain by 40% to 67% compared with a Data Plane Development Kit (DPDK)-based implementation, while maintaining the flexibility of VNF management.
  2. Virtual switches, used for end-host networking, drop packets when the receiving application is not fast enough to consume them. This is called the slow receiver problem, and it is important because packet loss hurts tail communication latency and wastes CPU cycles, resulting in application-level performance degradation. Further, solving this problem is challenging because application throughput is highly variable over short timescales as it depends on workload, memory contention, and OS thread scheduling. This paper presents Backdraft, a new lossless virtual switch that addresses the slow receiver problem by combining three new components: (1) Dynamic Per-Flow Queuing (DPFQ) to prevent head-of-line (HOL) blocking and provide on-demand memory usage (see the first sketch after this list); (2) Doorbell queues to reduce CPU overheads; (3) A new overlay network to avoid congestion spreading. We implemented Backdraft on top of BESS and conducted experiments with real applications on a 100 Gbps cluster with both DCTCP and Homa, a state-of-the-art congestion control scheme. We show that an application with Backdraft can achieve up to 20x lower tail latency at the 99th percentile.
  3. Virtual Network Functions (VNFs) are software implementations of middleboxes (MBs) (e.g., firewalls) that provide performance and security guarantees for virtual machine (VM) cloud applications. In this paper we study a new flow migration problem in VNF-enabled cloud data centers where the traffic rates of VM flows are constantly changing. Our goal is to minimize the total network traffic (thereby optimizing network resources such as bandwidth and energy) while taking into account that VNFs have limited processing capability. We formulate the flow migration problem and design two efficient benefit-based greedy algorithms (the second sketch after this list illustrates the general shape of such a greedy loop). The simulations show that our algorithms are effective in reducing network traffic as well as in achieving load balance among VNFs. In particular, our flow migration algorithms can reduce network traffic by up to 15% compared to the case without flow migration.
  4. We propose a new algorithmic framework for traffic-optimal virtual network function (VNF) placement and migration for policy-preserving data centers (PPDCs). As dynamic virtual machine (VM) traffic must traverse a sequence of VNFs in PPDCs, it generates more network traffic, consumes higher bandwidth, and causes additional traffic delays compared with a traditional data center. We design optimal, approximation, and heuristic traffic-aware VNF placement and migration algorithms to minimize the total network traffic in the PPDC. In particular, we propose the first traffic-aware constant-factor approximation algorithm for VNF placement, a Pareto-optimal solution for VNF migration, and a suite of efficient dynamic-programming (DP)-based heuristics that further improve the approximation solution. At the core of our framework are two new graph-theoretical problems that have not been studied before. Using flow characteristics found in production data centers and realistic traffic patterns, we show that a) our VNF migration techniques are effective in mitigating dynamic traffic in PPDCs, reducing the total traffic cost by up to 73%, b) our VNF placement algorithms yield traffic costs 56% to 64% smaller than those of existing techniques, and c) our VNF migration algorithms outperform the state-of-the-art VM migration algorithms by up to 63% in reducing dynamic network traffic.
  5. Future networks have to accommodate an increase of 3-4 orders of magnitude in data rates, with very heterogeneous session sizes and sometimes with strict time-deadline requirements. The dynamic nature of scheduling large transactions and the need for rapid actions by the Network Management and Control (NMC) system require timely collection of network state information. Rough estimates of the size of detailed network states suggest large volumes of data with refresh rates commensurate with the coherence time of the states (as fast as 100 ms), resulting in a huge burden and cost for network transport (300 Gbps/link) and computation resources. Thus, judicious sampling of network states is necessary for a cost-effective network management system. In this paper, we consider a construct of an NMC system where sensing and routing decisions are made with a cognitive understanding of the network states and the short-term behavior of exogenous offered traffic. We have studied a small but realistic example of adaptive monitoring based on significant sampling techniques (the last sketch after this list illustrates the adaptive idea). This technique balances the need for accurate and updated state information against the updating cost, and it provides an algorithm that yields near-optimum performance with a significantly reduced burden of sampling, transport, and computation. We show that our adaptive monitoring system can reduce the NMC overhead by a factor of 100 in one example. The essential spirit of the cognitive NMC is that it collects network states only when they matter to network performance.
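The sketches below are illustrative only; none reproduce code from the papers summarized above. First, for Backdraft (item 2): its real implementation lives in BESS/C, but a toy Go version of the dynamic per-flow queuing idea, creating a bounded queue per flow on demand so that one full queue back-pressures only its own flow, might look like this (all names, the queue depth, and the channel-based queues are assumptions):

```go
// Hypothetical sketch of dynamic per-flow queuing (not Backdraft's BESS/C code).
package main

import "fmt"

type pkt struct{ flow, data string }

// perFlowSwitch creates a bounded queue for each flow on demand, so a slow
// consumer on one flow does not head-of-line-block packets of other flows.
type perFlowSwitch struct {
	queues map[string]chan pkt
	depth  int
}

func newPerFlowSwitch(depth int) *perFlowSwitch {
	return &perFlowSwitch{queues: make(map[string]chan pkt), depth: depth}
}

func (s *perFlowSwitch) enqueue(p pkt) bool {
	q, ok := s.queues[p.flow]
	if !ok {
		q = make(chan pkt, s.depth) // on-demand memory usage
		s.queues[p.flow] = q
	}
	select {
	case q <- p:
		return true
	default:
		return false // queue full: back-pressure this flow only
	}
}

func main() {
	sw := newPerFlowSwitch(2)
	for i := 0; i < 3; i++ {
		ok := sw.enqueue(pkt{flow: "A", data: fmt.Sprint(i)})
		fmt.Println("flow A enqueue", i, "accepted:", ok)
	}
	fmt.Println("flow B enqueue accepted:", sw.enqueue(pkt{flow: "B"}))
}
```

Running the example, the third packet of flow A is rejected once A's queue is full, while flow B is still accepted, which is the head-of-line-blocking avoidance the abstract refers to.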
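Second, for the flow migration work (item 3): the paper's actual benefit function and constraints are not given on this page, so the following shows only the generic shape of a benefit-based greedy loop: repeatedly migrate the flow whose move yields the largest positive traffic reduction while respecting VNF capacity. The cost model (a per-VNF path length) and all names are invented for illustration:

```go
// Hypothetical benefit-based greedy migration loop; the cost model here is
// invented for illustration and is not the algorithm from the cited paper.
package main

import "fmt"

type flow struct {
	id   string
	rate float64 // current traffic rate
	vnf  int     // VNF instance currently serving the flow
}

// trafficCost is a stand-in cost function: traffic rate times an assumed
// path length to the candidate VNF instance.
func trafficCost(f flow, vnf int, pathLen []float64) float64 {
	return f.rate * pathLen[vnf]
}

func greedyMigrate(flows []flow, load, capacity, pathLen []float64) {
	for {
		best, bestGain, bestDst := -1, 0.0, -1
		for i, f := range flows {
			for v := range capacity {
				if v == f.vnf || load[v]+f.rate > capacity[v] {
					continue // respect limited VNF processing capability
				}
				gain := trafficCost(f, f.vnf, pathLen) - trafficCost(f, v, pathLen)
				if gain > bestGain {
					best, bestGain, bestDst = i, gain, v
				}
			}
		}
		if best < 0 {
			return // no migration with positive benefit remains
		}
		load[flows[best].vnf] -= flows[best].rate
		load[bestDst] += flows[best].rate
		flows[best].vnf = bestDst
		fmt.Printf("migrate %s -> VNF %d (benefit %.1f)\n", flows[best].id, bestDst, bestGain)
	}
}

func main() {
	flows := []flow{{"f1", 4, 0}, {"f2", 3, 0}}
	greedyMigrate(flows, []float64{7, 0}, []float64{10, 5}, []float64{2.0, 1.0})
}
```

Each iteration strictly reduces the total traffic cost, so the loop terminates once no capacity-respecting move has positive benefit.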
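Finally, for the adaptive monitoring work (item 5): significant sampling, as described in the abstract, collects state only when it matters. A minimal, made-up Go sketch of that adaptive idea, shrinking the polling interval while the monitored state changes significantly and backing off while it is stable, is below; the 10% threshold, the 100 ms floor, and the doubling/halving policy are assumptions, not the paper's algorithm:

```go
// Hypothetical adaptive monitoring loop in the spirit of significant sampling;
// thresholds and back-off factors are invented for illustration.
package main

import "fmt"

// nextInterval shrinks the sampling interval when the observed state changed
// significantly since the last sample, and grows it when the state is stable.
func nextInterval(intervalMs, prev, cur float64) float64 {
	const (
		minMs       = 100.0   // fastest refresh (state coherence time ~100 ms)
		maxMs       = 10000.0 // slowest refresh when the state barely changes
		significant = 0.10    // 10% relative change counts as significant
	)
	change := 0.0
	if prev != 0 {
		change = (cur - prev) / prev
		if change < 0 {
			change = -change
		}
	}
	if change > significant {
		intervalMs /= 2 // state is moving: sample faster
	} else {
		intervalMs *= 2 // state is quiet: back off, saving transport/compute
	}
	if intervalMs < minMs {
		intervalMs = minMs
	}
	if intervalMs > maxMs {
		intervalMs = maxMs
	}
	return intervalMs
}

func main() {
	linkLoad := []float64{0.2, 0.2, 0.21, 0.5, 0.9, 0.9, 0.9} // observed state samples
	interval := 1000.0
	for i := 1; i < len(linkLoad); i++ {
		interval = nextInterval(interval, linkLoad[i-1], linkLoad[i])
		fmt.Printf("sample %d: load %.2f -> next poll in %.0f ms\n", i, linkLoad[i], interval)
	}
}
```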