Advances in virtualization technologies and edge computing have inspired a new paradigm for Internet-of-Things (IoT) application development. By breaking a monolithic application into loosely coupled microservices, significant gains can be achieved in performance, flexibility and robustness. In this paper, we study the important problem of load balancing across IoT microservice instances. A key difficulty in this problem is the interdependencies among microservices: the load on a successor microservice instance directly depends on the load distributed from its predecessor microservice instances. We propose a graph-based model for describing the load dependencies among microservices. Based on the model, we first propose a basic formulation for load balancing, which can be solved optimally in polynomial time. The basic model neglects the quality-of-service (QoS) of the IoT application. We then propose a QoS-aware load balancing model, based on a novel abstraction that captures a realization of the application’s internal logic. The QoS-aware load balancing problem is NP-hard. We propose a fully polynomial-time approximation scheme for the QoS-aware problem. We show through simulation experiments that our proposed algorithm achieves enhanced QoS compared to heuristic solutions.
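The load dependency described above, where a successor instance's load is determined by what its predecessors forward to it, can be illustrated with a small sketch. This is not the paper's formulation: the function name `propagate_load`, the `split` fractions, and the example graph are all hypothetical, and the dependency graph is assumed to be acyclic.

```python
from collections import deque


def propagate_load(external_load, split):
    """Compute the load on every microservice instance of a dependency DAG.

    external_load: {instance: load arriving from outside the application}
    split: {predecessor: {successor: fraction of the predecessor's load}}
    Both structures are illustrative assumptions, not taken from the paper.
    """
    nodes = set(external_load) | set(split)
    for successors in split.values():
        nodes |= set(successors)

    # Count incoming edges so instances can be processed in topological order.
    indegree = {n: 0 for n in nodes}
    for successors in split.values():
        for succ in successors:
            indegree[succ] += 1

    load = {n: float(external_load.get(n, 0.0)) for n in nodes}
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        pred = queue.popleft()
        visited += 1
        # A predecessor's load is final once all of its own predecessors
        # have been processed, so it can now be forwarded downstream.
        for succ, fraction in split.get(pred, {}).items():
            load[succ] += load[pred] * fraction
            indegree[succ] -= 1
            if indegree[succ] == 0:
                queue.append(succ)

    if visited != len(nodes):
        raise ValueError("dependency graph contains a cycle")
    return load


# Hypothetical application: one front-end instance "a" splits its load
# evenly across two instances "b1" and "b2", which both feed instance "c".
example = propagate_load(
    {"a": 100},
    {"a": {"b1": 0.5, "b2": 0.5}, "b1": {"c": 1.0}, "b2": {"c": 1.0}},
)
```

In this sketch a load balancer would choose the `split` fractions; changing how "a" divides its load changes the load on "b1" and "b2" but not on "c", which shows why decisions at one tier cannot be made in isolation from the tiers below it.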
Data-Driven Edge Resource Provisioning for Inter-Dependent Microservices with Dynamic Load
This paper studies how to provision edge computing and network resources for complex microservice-based applications (MSAs) in the face of uncertain and dynamic geo-distributed demands. The complex inter-dependencies between distributed microservice components make load balancing for MSAs extremely challenging, and the dynamic geo-distributed demands exacerbate load imbalance and, consequently, congestion and performance loss. In this paper, we develop an edge resource provisioning model that accurately captures the inter-dependencies between microservices and their impact on load balancing across both computation and communication resources. We also propose a robust formulation that employs explicit risk estimation and optimization to hedge against potential worst-case load fluctuations, with a controlled robustness-resource trade-off. Using a data-driven approach, we provide a solution that estimates risk from measurement data of past load geo-distributions. Simulations with real-world datasets have validated that our solution provides the robustness that MSAs need, and that it outperforms baselines that neglect either network or inter-dependency constraints.
- PAR ID: 10328845
- Date Published:
- Journal Name: IEEE GLOBECOM
- Page Range / eLocation ID: 1 to 6
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Edge computing has enabled a large set of emerging edge applications by exploiting data proximity and offloading computation-intensive workloads to nearby edge servers. However, supporting edge application users at scale poses challenges due to limited point-of-presence edge sites and constrained elasticity. In this paper, we introduce a densely-distributed edge resource model that leverages capacity-constrained volunteer edge nodes to support elastic computation offloading. Our model also enables the use of geo-distributed edge nodes to further support elasticity. Collectively, these features raise the issue of edge selection. We present a distributed edge selection approach that relies on client-centric views of available edge nodes to optimize average end-to-end latency, with considerations of system heterogeneity, resource contention and node churn. Elasticity is achieved by fine-grained performance probing, dynamic load balancing, and proactive multi-edge node connections per client. Evaluations are conducted in both real-world volunteer environments and emulated platforms to show how a common edge application, namely AR-based cognitive assistance, can benefit from our approach and deliver low-latency responses to distributed users at scale.
- This paper focuses on optimizing resource allocation amongst a set of tenants, network slices, supporting dynamic customer loads over a set of distributed resources, e.g., base stations. The aim is to reap the benefits of statistical multiplexing resulting from flexible sharing of ‘pooled’ resources, while enabling tenants to differentiate and protect their performance from one another’s load fluctuations. To that end we consider a setting where resources are grouped into Virtual Resource Pools (VRPs) wherein resource allocation is jointly and dynamically managed. Specifically, for each VRP we adopt a Share-Constrained Proportionally Fair (SCPF) allocation scheme where each tenant is allocated a fixed share (budget). This budget is to be distributed equally amongst its active customers, which in turn are granted fractions of their associated VRP resources in proportion to customer shares. For a VRP with a single resource, this translates to the well-known Generalized Processor Sharing (GPS) policy. For VRPs with multiple resources, SCPF provides a flexible means to achieve load-elastic allocations across tenants sharing the pool. Given tenants’ per-resource shares and expected loads, this paper formulates the problem of determining optimal VRP partitions which maximize the overall expected shared weighted utility while ensuring protection guarantees. For a high load/capacity setting we exhibit this network utility function explicitly, quantifying the benefits and penalties of any VRP partition in terms of network slices’ ability to achieve performance differentiation, load balancing, and statistical multiplexing. Although the problem is shown to be NP-hard, a simple greedy heuristic is shown to be effective. Analysis and simulations confirm that the selection of optimal VRP partitions provides a practical avenue towards improving network utility in network slicing scenarios with dynamic loads.
- Geo-distributed Edge sites are expected to cater to the stringent demands of situation-aware applications like collaborative autonomous vehicles and drone swarms. While clients of such applications benefit from having network-proximal compute resources, an Edge site has limited resources compared to the traditional Cloud. Moreover, the load experienced by an Edge site depends on a client's mobility pattern, which may often be unpredictable. The Function-as-a-Service (FaaS) paradigm is well suited to handle the ephemeral nature of workload demand at Edge sites. In FaaS, applications are decomposed into containerized functions, enabling fine-grained resource management. However, spatio-temporal variations in client mobility can still lead to rapid saturation of resources beyond the capacity of an Edge site. To address this challenge, we develop FEO (Federated Edge Orchestrator), a resource allocation scheme across the geo-distributed Edge infrastructure for FaaS. FEO employs a novel federated policy to offload function invocations to peer sites with spare resource capacity without the need to frequently share knowledge about available capacities among participating sites. Detailed experiments show that FEO's approach can reduce a site's P99 latency by almost 3x, while maintaining application service level objectives at all other sites.
- We investigate the use of SmartNIC-accelerated servers to execute microservice-based applications in the data center. By offloading suitable microservices to the SmartNIC’s low-power processor, we can improve server energy-efficiency without latency loss. However, as a heterogeneous computing substrate in the data path of the host, SmartNICs bring several challenges to a microservice platform: network traffic routing and load balancing, microservice placement on heterogeneous hardware, and contention on shared SmartNIC resources. We present E3, a microservice execution platform for SmartNIC-accelerated servers. E3 follows the design philosophies of the Azure Service Fabric microservice platform and extends key system components to a SmartNIC to address the above-mentioned challenges. E3 employs three key techniques: ECMP-based load balancing via SmartNICs to the host, network topology-aware microservice placement, and a data-plane orchestrator that can detect SmartNIC overload. Our E3 prototype using Cavium LiquidIO SmartNICs shows that SmartNIC offload can improve cluster energy-efficiency up to 3× and cost efficiency up to 1.9× at up to 4% latency cost for common microservices, including real-time analytics, an IoT hub, and virtual network functions.
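The Share-Constrained Proportionally Fair (SCPF) allocation described in the network-slicing abstract above can be illustrated with a short sketch. It follows only the wording of that abstract: a tenant's share is split equally among its active customers in a pool, and each resource is divided among the customers attached to it in proportion to those per-customer weights. The function name `scpf_allocation`, the tuple layout, and the example values are hypothetical, and every tenant share is assumed to be positive.

```python
from collections import Counter, defaultdict


def scpf_allocation(tenant_share, customers):
    """Return the fraction of its resource granted to each active customer.

    tenant_share: {tenant: share (budget) of the tenant in this pool}
    customers: list of (customer_id, tenant, resource) for active customers
    Shares are assumed positive; this is a sketch, not the paper's model.
    """
    # Each tenant's budget is split equally among its active customers.
    active = Counter(tenant for _, tenant, _ in customers)
    weight = {
        cid: tenant_share[tenant] / active[tenant]
        for cid, tenant, _ in customers
    }

    # Total weight competing for each resource in the pool.
    total = defaultdict(float)
    for cid, _, resource in customers:
        total[resource] += weight[cid]

    # Each customer gets its resource in proportion to its weight.
    return {cid: weight[cid] / total[resource] for cid, _, resource in customers}


# Hypothetical single-resource pool: tenant A has two active customers,
# tenant B has one, and both tenants hold an equal share.
single = scpf_allocation(
    {"A": 0.5, "B": 0.5},
    [("a1", "A", "bs1"), ("a2", "A", "bs1"), ("b1", "B", "bs1")],
)
```

With a single resource, the per-tenant totals in `single` match the tenant shares (half of `bs1` each), which is consistent with the abstract's remark that the scheme reduces to Generalized Processor Sharing in that case.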