Title: Traffic-Optimal Virtual Network Function Placement and Migration in Dynamic Cloud Data Centers
We propose a new algorithmic framework for traffic-optimal virtual network function (VNF) placement and migration for policy-preserving data centers (PPDCs). As dynamic virtual machine (VM) traffic must traverse a sequence of VNFs in PPDCs, it generates more network traffic, consumes more bandwidth, and causes longer traffic delays than in a traditional data center. We design optimal, approximation, and heuristic traffic-aware VNF placement and migration algorithms to minimize the total network traffic in the PPDC. In particular, we propose the first traffic-aware constant-factor approximation algorithm for VNF placement, a Pareto-optimal solution for VNF migration, and a suite of efficient dynamic-programming (DP)-based heuristics that further improves the approximation solution. At the core of our framework are two new graph-theoretical problems that have not been studied. Using flow characteristics found in production data centers and realistic traffic patterns, we show that a) our VNF migration techniques are effective in mitigating dynamic traffic in PPDCs, reducing the total traffic cost by up to 73%, b) our VNF placement algorithms yield traffic costs 56% to 64% smaller than those of existing techniques, and c) our VNF migration algorithms outperform the state-of-the-art VM migration algorithms by up to 63% in reducing dynamic network traffic.
Award ID(s):
1911191
NSF-PAR ID:
10357756
Journal Name:
36th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2022)
Page Range or eLocation-ID:
919 to 929
Sponsoring Org:
National Science Foundation
More Like this
  1. Virtual Network Functions (VNFs) are software implementations of middleboxes (MBs) (e.g., firewalls and proxy servers) that provide performance and security guarantees for virtual machine (VM) cloud applications. In this paper, we study a new VM flow migration problem for dynamic VNF-enabled cloud data centers (VDCs). The goal is to migrate the VM flows in the dynamic VDCs to minimize the total network traffic while load-balancing VNFs with limited processing capabilities. We refer to the problem as FMDV: flow migration in dynamic VDCs. We propose an optimal and efficient minimum cost flow-based flow migration algorithm and two efficient benefit-based heuristic algorithms to solve the FMDV. Via extensive simulations, we show that our algorithms are effective in mitigating dynamic cloud traffic while achieving load balance among VNFs. In particular, all our algorithms reduce dynamic network traffic in all cases, and our optimal algorithm always achieves the best traffic-mitigation effect, reducing the network traffic by up to 28% compared to the case without flow migration.
  2. Virtual Network Functions (VNFs) are software implementations of middleboxes (MBs) (e.g., firewalls) that provide performance and security guarantees for virtual machine (VM) cloud applications. In this paper we study a new flow migration problem in VNF-enabled cloud data centers where the traffic rates of VM flows are constantly changing. Our goal is to minimize the total network traffic (thereby optimizing network resources such as bandwidth and energy) while considering that VNFs have limited processing capability. We formulate the flow migration problem and design two efficient benefit-based greedy algorithms. The simulations show that our algorithms are effective in reducing the network traffic as well as in achieving load balance among VNFs. In particular, our flow migration algorithms can reduce network traffic by up to 15% compared to the case without flow migration.
  3. The high reliability required by many future-generation network services can be enforced by proper resource assignments by means of logical partitions, i.e., network slices, applied in optical metro-aggregation networks. Different strategies can be applied to deploy the virtual network functions (VNFs) composing the slices over physical nodes, while providing different levels of resource isolation (among slices) and protection against failures, based on several available techniques. In optical metro-aggregation networks, protection can be ensured at different layers, and slice protection with traffic grooming calls for evolved multilayer protection approaches. In this paper, we investigate the problem of reliable slicing with protection at the lightpath layer for different levels of slice isolation and different VNF deployment strategies. We model the problem through an integer linear program (ILP), and we devise a heuristic for joint optimization of VNF placement and lightpath selection. The heuristic maps nodes and links over the physical network in a coordinated manner and provides an effective placement of radio access network functions and the routing and wavelength assignment for the optical layer. The effectiveness of the proposed heuristic is validated by comparison with the optimal solution provided by the ILP. Our illustrative numerical results compare the impact of different levels of isolation, showing that higher levels of network and VNF isolation are characterized by higher costs in terms of optical and computation resources.
  4. Virtual machine (VM) replication is an effective technique in cloud data centers to achieve fault tolerance, load balance, and quick responsiveness to user requests. In this paper we study a new fault-tolerant VM placement problem referred to as FT-VMP. Given that different VMs have different fault-tolerance requirements (i.e., different VMs require different numbers of replica copies) and compatibility requirements (i.e., some VMs and their replicas cannot be placed into some physical machines (PMs) due to software or platform incompatibility), FT-VMP studies how to place VM replica copies inside cloud data centers in order to minimize the number of PMs storing VM replicas, under the constraints that i) for fault tolerance, replica copies of the same VM cannot be placed inside the same PM and ii) each PM has a limited amount of storage capacity. We first prove that FT-VMP is NP-hard. We then design an integer linear programming (ILP)-based algorithm to solve it optimally. As the ILP is time-consuming to solve and thus not suitable for large-scale cloud data centers, we design a suite of efficient and scalable heuristic fault-tolerant VM placement algorithms. We show that a) the ILP-based algorithm outperforms the state-of-the-art VM replica placement in a wide range of network dynamics and b) all our fault-tolerant VM placement algorithms are able to turn off a significant number of PMs to save energy in cloud data centers. In particular, we show that our algorithms can consolidate (i.e., turn off) around 100 PMs in a small data center of 256 PMs and 700 PMs in a large data center of 1028 PMs.
  5. Service function chaining (SFC), consisting of a sequence of virtual network functions (VNFs) (e.g., firewalls and load balancers), is an effective service provision technique in modern data center networks. By requiring cloud user traffic to traverse the VNFs in order, SFC improves the security and performance of the cloud user applications. In this paper, we study how to place an SFC inside a data center to minimize the network traffic of the virtual machine (VM) communication. We take a cooperative multi-agent reinforcement learning approach, wherein multiple agents collaboratively figure out the traffic-efficient route for the VM communication. Underlying the SFC placement is a fundamental graph-theoretical problem called the k-stroll problem. Given a weighted graph G(V, E), two nodes s, t ∈ V, and an integer k, the k-stroll problem is to find the shortest path from s to t that visits at least k other nodes in the graph. Our work is the first to take a multi-agent learning approach to solve the k-stroll problem. We compare our learning algorithm with an optimal and exhaustive algorithm and an existing dynamic programming (DP)-based heuristic algorithm. We show that our learning algorithm, although lacking the complete knowledge of the network assumed by existing research, delivers comparable or even better VM communication time while taking two orders of magnitude less execution time.
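
To make the k-stroll definition in item 5 concrete, the following is a minimal exhaustive sketch, not the algorithm from any of the papers listed here. It assumes an undirected graph with non-negative edge weights stored as a nested dictionary, assumes s and t are distinct, restricts the search to simple paths, and reads "k other nodes" as k intermediate nodes besides s and t. The function name `k_stroll_brute_force` and the example graph are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Adjacency map: graph[u][v] is the (non-negative) weight of edge u-v.
Graph = Dict[str, Dict[str, float]]


def k_stroll_brute_force(
    graph: Graph, s: str, t: str, k: int
) -> Optional[Tuple[float, List[str]]]:
    """Return (cost, path) of a cheapest simple s-t path that visits
    at least k nodes other than s and t, or None if no such path exists.

    Exhaustive depth-first search; exponential time, so small graphs only.
    """
    best: Optional[Tuple[float, List[str]]] = None

    def dfs(node: str, visited: set, path: List[str], cost: float) -> None:
        nonlocal best
        # With non-negative weights, a partial path that already costs
        # at least as much as the best known answer cannot improve on it.
        if best is not None and cost >= best[0]:
            return
        if node == t:
            # A simple path ends the first time it reaches t.
            if len(path) - 2 >= k:
                best = (cost, list(path))
            return
        for nxt, weight in graph[node].items():
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                dfs(nxt, visited, path, cost + weight)
                path.pop()
                visited.remove(nxt)

    dfs(s, {s}, [s], 0.0)
    return best


# Hypothetical four-node example graph.
example: Graph = {
    "s": {"a": 1, "t": 1, "b": 2},
    "a": {"s": 1, "t": 1, "b": 1},
    "b": {"s": 2, "t": 2, "a": 1},
    "t": {"s": 1, "a": 1, "b": 2},
}

direct = k_stroll_brute_force(example, "s", "t", 0)
one_stop = k_stroll_brute_force(example, "s", "t", 1)
two_stops = k_stroll_brute_force(example, "s", "t", 2)
too_many = k_stroll_brute_force(example, "s", "t", 3)
```

In this example, raising k forces the path away from the direct edge s-t and through progressively more intermediate nodes, which mirrors how a longer service function chain lengthens the route VM traffic must take. Formulations that allow walks with repeated nodes would need a different search.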