skip to main content

Title: Service Function Chain Placement in Cloud Data Center Networks: a Cooperative Multi-Agent Reinforcement Learning Approach
Service function chaining (SFC), consisting of a sequence of virtual network functions (VNFs) (i.e., firewalls and load balancers), is an effective service provision technique in modern data center networks. By requiring cloud user traffic to traverse the VNFs in order, SFC im- proves the security and performance of the cloud user applications. In this paper, we study how to place an SFC inside a data center to mini- mize the network traffic of the virtual machine (VM) communication. We take a cooperative multi-agent reinforcement learning approach, wherein multiple agents collaboratively figure out the traffic-efficient route for the VM communication. Underlying the SFC placement is a fundamental graph-theoretical prob- lem called the k-stroll problem. Given a weighted graph G(V, E), two nodes s, t ∈ V , and an integer k, the k-stroll problem is to find the shortest path from s to t that visits at least k other nodes in the graph. Our work is the first to take a multi-agent learning approach to solve k- stroll problem. We compare our learning algorithm with an optimal and exhaustive algorithm and an existing dynamic programming(DP)-based heuristic algorithm. We show that our learning algorithm, although lack- ing the complete knowledge of the network assumed by existing research, delivers comparable or even better VM communication time while taking two orders of magnitude more » of less execution time. « less
Authors:
; ;
Award ID(s):
2131309
Publication Date:
NSF-PAR ID:
10388052
Journal Name:
11th EAI International Conference on Game Theory for Networks (GameNets 2021)
Sponsoring Org:
National Science Foundation
More Like this
  1. With the advent of Network Function Virtualization (NFV), Physical Network Functions (PNFs) are gradually being replaced by Virtual Network Functions (VNFs) that are hosted on general purpose servers. Depending on the call flows for specific services, the packets need to pass through an ordered set of network functions (physical or virtual) called Service Function Chains (SFC) before reaching the destination. Conceivably for the next few years during this transition, these networks would have a mix of PNFs and VNFs, which brings an interesting mix of network problems that are studied in this paper: (1) How to find an SFC-constrained shortest path between any pair of nodes? (2) What is the achievable SFC-constrained maximum flow? (3) How to place the VNFs such that the cost (the number of nodes to be virtualized) is minimized, while the maximum flow of the original network can still be achieved even under the SFC constraint? In this work, we will try to address such emerging questions. First, for the SFC-constrained shortest path problem, we propose a transformation of the network graph to minimize the computational complexity of subsequent applications of any shortest path algorithm. Second, we formulate the SFC-constrained maximum flow problem as a fractionalmore »multicommodity flow problem, and develop a combinatorial algorithm for a special case of practical interest. Third, we prove that the VNFs placement problem is NP-hard and present an alternative Integer Linear Programming (ILP) formulation. Finally, we conduct simulations to elucidate our theoretical results.« less
  2. We propose a new algorithmic framework for traffic-optimal virtual network function (VNF) placement and migration for policy-preserving data centers (PPDCs). As dy- namic virtual machine (VM) traffic must traverse a sequence of VNFs in PPDCs, it generates more network traffic, consumes higher bandwidth, and causes additional traffic delays than a traditional data center. We design optimal, approximation, and heuristic traffic-aware VNF placement and migration algorithms to minimize the total network traffic in the PPDC. In particular, we propose the first traffic-aware constant-factor approximation algorithm for VNF placement, a Pareto-optimal solution for VNF migration, and a suite of efficient dynamic-programming (DP)-based heuristics that further improves the approximation solution. At the core of our framework are two new graph- theoretical problems that have not been studied. Using flow characteristics found in production data centers and realistic traffic patterns, we show that a) our VNF migration techniques are effective in mitigating dynamic traffic in PPDCs, reducing the total traffic cost by up to 73%, b) our VNF placement algorithms yield traffic costs 56% to 64% smaller than those by existing techniques, and c) our VNF migration algorithms outperform the state-of-the-art VM migration algorithms by up to 63% in reducing dynamic network traffic.
  3. Abstract—Virtual Network Functions (VNFs) are software implementation of middleboxes (MBs) (e.g., firewalls and proxy servers) that provide performance and security guarantees for virtual machine (VM) cloud applications. In this paper, we study a new VM flow migration problem for dynamic VNF-enabled cloud data centers (VDCs). The goal is to migrate the VM flows in the dynamic VDCs to minimize the total network traffic while load-balancing VNFs with limited processing capabilities. We refer to the problem as FMDV: flow migration in dynamic VDCs. We propose an optimal and efficient minimum cost flow-based flow migration algorithm and two benefit-based efficient heuristic algorithms to solve the FMDV. Via extensive simulations, we show that our algorithms are effective in mitigating dynamic cloud traffic while achieving load balance among VNFs. In particular, all our algorithms reduce dynamic network traffic in all cases and our optimal algorithm always achieves the best traffic-mitigation effect, reducing the network traffic by up to 28% compared to the case without flow migration.
  4. Virtual Network Functions (VNFs) are software implementation of middleboxes (MBs) (e.g., firewalls) that provide performance and security guarantees for virtual machine (VM) cloud applications. In this paper we study a new flow migration problem in VNF-enabled cloud data centers where the traffic rates of VM flows are constantly changing. Our goal is to minimize the total network traffic (therefore optimizing the network resources such as bandwidth and energy) while considering that VNFs have limited processing capability. We formulate the flow migration problem and design two efficient benefit-based greedy algorithms. The simulations show that our algorithms are effective in reducing the network traffic as well as in achieving load balance among VNFs. In particular, our flow migration algorithms can reduce upto 15% network traffic compared to the case without flow migration.
  5. Virtual machine (VM) replication is an effective technique in cloud data centers to achieve fault-tolerance, load-balance, and quick-responsiveness to user requests. In this paper we study a new fault-tolerant VM placement problem referred to as FT-VMP. Given that different VM has different fault-tolerance requirement (i.e., different VM requires different number of replica copies) and compatibility requirement (i.e., some VMs and their replicas cannot be placed into some physical machines (PMs) due to software or platform incompatibility), FT-VMP studies how to place VM replica copies inside cloud data centers in order to minimize the number of PMs storing VM replicas, under the constraints that i) for fault-tolerant purpose, replica copies of the same VM cannot be placed inside the same PM and ii) each PM has a limited amount of storage capacity. We first prove that FT-VMP is NP-hard. We then design an integer linear programming (ILP)-based algorithm to solve it optimally. As ILP takes time to compute thus is not suitable for large scale cloud data centers, we design a suite of efficient and scalable heuristic fault-tolerant VM placement algorithms. We show that a) ILP-based algorithm outperforms the state-of-the-art VM replica placement in a wide range of network dynamics andmore »b) that all our fault-tolerant VM placement algorithms are able to turn off significant number of PMs to save energy in cloud data centers. In particular, we show that our algorithms can consolidate (i.e., turn off) around 100 PMs in a small data center of 256 PMs and 700 PMs in a large data center of 1028PMs.« less