Title: EdgeBalance: Model-Based Load Balancing for Network Edge Data Planes
Edge data centers are an appealing place for telecommunication providers to offer in-network processing such as VPN services, security monitoring, and 5G. Placing these network services closer to users can reduce latency and core network bandwidth, but deploying network functions at the edge poses several important challenges. Edge data centers have limited resource capacity, yet network functions are resource-intensive and have strict performance requirements. Services must be replicated at the edge to meet demand, but balancing the load across multiple servers is difficult because of diverse service costs, server and flow heterogeneity, and dynamic workload conditions. In this paper, we design and implement EdgeBalance, a model-based load balancer for edge network data planes. EdgeBalance predicts the CPU demand of incoming traffic and adaptively distributes flows to servers to keep them evenly balanced. We overcome several challenges specific to network processing at the edge to improve throughput and latency over static load balancing and monitoring-based approaches.
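The idea of predicting CPU demand and placing each flow on the least-loaded server can be illustrated with a minimal sketch. This is not EdgeBalance's actual model: the linear per-packet cost table, the service names, the server capacities, and the greedy least-utilization rule are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical per-packet CPU cost (cycles) for each service type.
# These numbers are made up for illustration, not measured values.
CYCLES_PER_PACKET = {"vpn": 3000.0, "ids": 5000.0, "forward": 500.0}


@dataclass
class Server:
    name: str
    capacity: float              # CPU cycles per second the server can supply
    predicted_load: float = 0.0  # predicted cycles per second of assigned flows

    def utilization_with(self, demand: float) -> float:
        """Predicted utilization if a flow with this demand were added."""
        return (self.predicted_load + demand) / self.capacity


def predict_demand(service: str, packets_per_sec: float) -> float:
    """Assumed linear cost model: cycles/sec = cycles/packet * packets/sec."""
    return CYCLES_PER_PACKET[service] * packets_per_sec


def assign_flow(servers, service, packets_per_sec):
    """Greedily place a flow on the server with the lowest predicted utilization."""
    demand = predict_demand(service, packets_per_sec)
    best = min(servers, key=lambda s: s.utilization_with(demand))
    best.predicted_load += demand
    return best


if __name__ == "__main__":
    # Heterogeneous servers: "a" has twice the capacity of "b".
    servers = [Server("a", 2e9), Server("b", 1e9)]
    for service, pps in [("ids", 100_000), ("vpn", 100_000), ("forward", 200_000)]:
        chosen = assign_flow(servers, service, pps)
        print(f"{service:8s} -> {chosen.name}")
```

Normalizing by capacity lets the same rule handle heterogeneous servers. A static hash-based balancer, by contrast, would ignore both the per-service cost and the server capacity.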
Journal Name:
Workshop on Hot Topics in Edge Computing
Sponsoring Org:
National Science Foundation
More Like this
  1. We investigate the use of SmartNIC-accelerated servers to execute microservice-based applications in the data center. By offloading suitable microservices to the SmartNIC’s low-power processor, we can improve server energy-efficiency without latency loss. However, as a heterogeneous computing substrate in the data path of the host, SmartNICs bring several challenges to a microservice platform: network traffic routing and load balancing, microservice placement on heterogeneous hardware, and contention on shared SmartNIC resources. We present E3, a microservice execution platform for SmartNIC-accelerated servers. E3 follows the design philosophies of the Azure Service Fabric microservice platform and extends key system components to a SmartNIC to address the above-mentioned challenges. E3 employs three key techniques: ECMP-based load balancing via SmartNICs to the host, network topology-aware microservice placement, and a data-plane orchestrator that can detect SmartNIC overload. Our E3 prototype using Cavium LiquidIO SmartNICs shows that SmartNIC offload can improve cluster energy-efficiency up to 3× and cost efficiency up to 1.9× at up to 4% latency cost for common microservices, including real-time analytics, an IoT hub, and virtual network functions.
  2. Edge cloud data centers (Edge) are deployed to provide responsive services to end users. Edge can host more powerful CPUs and DNN accelerators such as GPUs and may be used for offloading tasks from end-user devices that require more significant compute capabilities. But Edge resources may also be limited and must be shared across multiple applications that process requests concurrently from several clients. However, multiplexing GPUs across applications is challenging. With edge cloud servers needing to process a lot of streaming data and the advent of multi-GPU systems, getting that data from the network to the GPU can be a bottleneck, limiting the amount of work the GPU cluster can do. The lack of prompt notification of job completion from the GPU can also result in poor GPU utilization. We build on our recent work on controlled spatial sharing of a single GPU, extend it to support multi-GPU systems, and propose a framework that addresses these challenges. Unlike the state-of-the-art uncontrolled spatial sharing currently available with systems such as CUDA-MPS, our controlled spatial sharing approach uses each of the GPUs in the cluster efficiently by removing interference between applications, resulting in much better, predictable inference latency. We also use each of the cluster GPUs' DMA engines to offload data transfers to the GPU complex, thereby preventing the CPU from being the bottleneck. Finally, our framework uses the CUDA event library to give timely, low-overhead GPU notifications. Our evaluations show we can achieve low DNN inference latency and improve DNN inference throughput by at least a factor of 2.
  3. Virtual Network Functions (VNFs) are software implementations of middleboxes (MBs) (e.g., firewalls and proxy servers) that provide performance and security guarantees for virtual machine (VM) cloud applications. In this paper, we study a new VM flow migration problem for dynamic VNF-enabled cloud data centers (VDCs). The goal is to migrate the VM flows in the dynamic VDCs to minimize the total network traffic while load-balancing VNFs with limited processing capabilities. We refer to the problem as FMDV: flow migration in dynamic VDCs. We propose an optimal and efficient minimum cost flow-based flow migration algorithm and two benefit-based efficient heuristic algorithms to solve the FMDV. Via extensive simulations, we show that our algorithms are effective in mitigating dynamic cloud traffic while achieving load balance among VNFs. In particular, all our algorithms reduce dynamic network traffic in all cases and our optimal algorithm always achieves the best traffic-mitigation effect, reducing the network traffic by up to 28% compared to the case without flow migration.
  4. The proliferation of innovative mobile services such as augmented reality, networked gaming, and autonomous driving has spurred a growing need for low-latency access to computing resources that cannot be met solely by existing centralized cloud systems. Mobile Edge Computing (MEC) is expected to be an effective solution to meet the demand for low-latency services by enabling the execution of computing tasks at the network periphery, in proximity to end users. While a number of recent studies have addressed the problem of determining the execution of service tasks and the routing of user requests to corresponding edge servers, the focus has primarily been on the efficient utilization of computing resources, neglecting the fact that non-trivial amounts of data need to be stored to enable service execution, and that many emerging services exhibit asymmetric bandwidth requirements. To fill this gap, we study the joint optimization of service placement and request routing in MEC-enabled multi-cell networks with multidimensional (storage-computation-communication) constraints. We show that this problem generalizes several problems in the literature and propose an algorithm that achieves close-to-optimal performance using randomized rounding. Evaluation results demonstrate that our approach can effectively utilize the available resources to maximize the number of requests served by low-latency edge cloud servers.
  5. A new breed of applications, such as autonomous driving, and their need for computation-aided quick decision making have motivated the delegation of compute-intensive services (e.g., video analytics) to more powerful surrogate machines at the network edge, an approach known as edge computing (EC). Recently, the notion of pervasive edge computing (PEC) has emerged, in which users' devices can join the pool of computing resources that perform edge computing. Including users' devices increases the computing capability at the edge (adding to the infrastructure servers), but in comparison to conventional edge ecosystems, it also introduces new challenges, such as service orchestration (i.e., service placement, discovery, and migration). We propose uDiscover, a novel user-driven service discovery and utilization framework for the PEC ecosystem. In designing uDiscover, we considered the Named-Data Networking architecture for balancing users' workloads and reducing user-perceived latency. We propose proactive and reactive service discovery approaches and assess their performance in PEC and infrastructure-only ecosystems. Our simulation results show that (i) the PEC ecosystem reduces user-perceived delays by up to 70%, and (ii) uDiscover selects the most suitable server ("accurate" delay estimates with less than 10% error) to execute any given task.