skip to main content


Title: Efficient service handoff across edge servers via docker container migration
Supporting smooth movement of mobile clients is important when offloading services on an edge computing platform. Interruption free client mobility demands seamless migration of the offloading service to nearby edge servers. However, fast migration of offloading services across edge servers in a WAN environment poses significant challenges to the handoff service design. In this paper, we present a novel service handoff system which seamlessly migrates offloading services to the nearest edge server, while the mobile client is moving. Service handoff is achieved via container migration. We identify an important performance problem during Docker container migration. Based on our systematic study of container layer management and image stacking, we propose a migration method which leverages the layered storage system to reduce file system synchronization overhead, without dependence on the distributed file system. We implement a prototype system and conduct experiments using real world product applications. Evaluation results reveal that compared to state-of-the-art service handoff systems designed for edge computing platforms, our system reduces the total duration of service handoff time by 80% (56%) with network bandwidth 5Mbps (20Mbps).  more » « less
Award ID(s):
1741635
NSF-PAR ID:
10077232
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
ACM/IEEE Symposium on Edge Computing
Page Range / eLocation ID:
1 to 13
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Emerging Edge Computing (EC) technology has shown promise for many delay-sensitive Deep Learning (DL) based applications of smart cities in terms of improved Quality-of-Service (QoS). EC requires judicious decisions which jointly consider the limited capacity of the edge servers and provided QoS of DL-dependent services. In a smart city environment, tasks may have varying priorities in terms of when and how to serve them; thus, priorities of the tasks have to be considered when making resource management decisions. In this paper, we focus on finding optimal offloading decisions in a three-tier user-edge-cloud architecture while considering different priority classes for the DL-based services and making a trade-off between a task’s completion time and the provided accuracy by the DL-based service. We cast the optimization problem as an Integer Linear Program (ILP) where the objective is to maximize a function called gain of system (GoS) defined based on provided QoS and priority of the tasks. We prove the problem is NP-hard. We then propose an efficient offloading algorithm, called PGUS, that is shown to achieve near-optimal results in terms of the provided GoS. Finally, we compare our proposed algorithm, PGUS, with heuristics and a state-of-the-art algorithm, called GUS, using both numerical analysis and real-world implementation. Our results show that PGUS outperforms GUS by a factor of 45% in average in terms of serving the top 25% higher priority classes of the tasks while still keeping the overall percentage of the dropped tasks minimal and the overall gain of system maximized. 
    more » « less
  2. Task offloading in edge computing infrastructure remains a challenge for dynamic and complex environments, such as Industrial Internet-of-Things. The hardware resource constraints of edge servers must be explicitly considered to ensure that system resources are not overloaded. Many works have studied task offloading while focusing primarily on ensuring system resilience. However, in the face of deep learning-based services, model performance with respect to loss/accuracy must also be considered. Deep learning services with different implementations may provide varying amounts of loss/accuracy while also being more complex to run inference on. That said, communication latency can be reduced to improve overall Quality-of-Service by employing compression techniques. However, such techniques can also have the side-effect of reducing the loss/accuracy provided by deep learning-based service. As such, this work studies a joint optimization problem for task offloading decisions in 3-tier edge computing platforms where decisions regarding task offloading are made in tandem with compression decisions. The objective is to optimally offload requests with compression such that the trade-off between latency-accuracy is not greatly jeopardized. We cast this problem as a mixed integer nonlinear program. Due to its nonlinear nature, we then decompose it into separate subproblems for offloading and compression. An efficient algorithm is proposed to solve the problem. Empirically, we show that our algorithm attains roughly a 0.958-approximation of the optimal solution provided by a block coordinate descent method for solving the two sub-problems back-to-back. 
    more » « less
  3. Cyber foraging techniques have been proposed in edge computing to support resource-intensive and latency-sensitive mobile applications. In a natural or man-made disaster scenario, all cyber foraging challenges are exacerbated by two problems: edge nodes are scarce and hence easily overloaded and failures are common due to the ad-hoc hostile conditions. In this paper, we study the use of efficient load profiling and migration strategies to mitigate such problems. In particular, we propose FORMICA, an architecture for cyber foraging orchestration, whose goal is to minimize the completion time of a set of jobs offloaded from mobile devices. Existing service offloading solutions are mainly concerned with outsourcing a job out of the mobile responsibility. Our architecture supports both mobile-based offloading and backend-driven onloading i.e., the offloading decision is taken by the edge infrastructure and not by the mobile node. FORMICA leverages Gelenbe networks to estimate the load profile of each node of the edge computing infrastructure to make proactive load profiling decisions. Our evaluation on a proof-of-concept implementation shows the benefits of our policy-based architecture in several (challenged disaster) scenarios but its applicability is broad to other IoT-based latency-sensitive applications. 
    more » « less
  4. Edge computing has enabled a large set of emerging edge applications by exploiting data proximity and offloading computation-intensive workloads to nearby edge servers. However, supporting edge application users at scale poses challenges due to limited point-of-presence edge sites and constrained elasticity. In this paper, we introduce a densely-distributed edge resource model that leverages capacity-constrained volunteer edge nodes to support elastic computation offloading. Our model also enables the use of geo-distributed edge nodes to further support elasticity. Collectively, these features raise the issue of edge selection. We present a distributed edge selection approach that relies on client-centric views of available edge nodes to optimize average end-to-end latency, with considerations of system heterogeneity, resource contention and node churn. Elasticity is achieved by fine-grained performance probing, dynamic load balancing, and proactive multi-edge node connections per client. Evaluations are conducted in both real-world volunteer environments and emulated platforms to show how a common edge application, namely AR-based cognitive assistance, can benefit from our approach and deliver low-latency responses to distributed users at scale. 
    more » « less
  5. Deep neural networks (DNNs) are being applied to various areas such as computer vision, autonomous vehicles, and healthcare, etc. However, DNNs are notorious for their high computational complexity and cannot be executed efficiently on resource constrained Internet of Things (IoT) devices. Various solutions have been proposed to handle the high computational complexity of DNNs. Offloading computing tasks of DNNs from IoT devices to cloud/edge servers is one of the most popular and promising solutions. While such remote DNN services provided by servers largely reduce computing tasks on IoT devices, it is challenging for IoT devices to inspect whether the quality of the service meets their service level objectives (SLO) or not. In this paper, we address this problem and propose a novel approach named QIS (quality inspection sampling) that can efficiently inspect the quality of the remote DNN services for IoT devices. To realize QIS, we design a new ID-generation method to generate data (IDs) that can identify the serving DNN models on edge servers. QIS inserts the IDs into the input data stream and implements sampling inspection on SLO violations. The experiment results show that the QIS approach can reliably inspect, with a nearly 100% success rate, the service qualtiy of remote DNN services when the SLA level is 99.9% or lower at the cost of only up to 0.5% overhead. 
    more » « less