skip to main content


Title: The hidden cost of the edge: a performance comparison of edge and cloud latencies
Edge computing has emerged as a popular paradigm for running latency-sensitive applications due to its ability to offer lower network latencies to end-users. In this paper, we argue that despite its lower network latency, the resource-constrained nature of the edge can result in higher end-to-end latency, especially at higher utilizations, when compared to cloud data centers. We study this edge performance inversion problem through an analytic comparison of edge and cloud latencies and analyze conditions under which the edge can yield worse performance than the cloud. To verify our analytic results, we conduct a detailed experimental comparison of the edge and the cloud latencies using a realistic application and real cloud workloads. Both our analytical and experimental results show that even at moderate utilizations, the edge queuing delays can offset the benefits of lower network latencies, and even result in performance inversion where running in the cloud would provide superior latencies. We finally discuss practical implications of our results and provide insights into how application designers and service providers should design edge applications and systems to avoid these pitfalls.  more » « less
Award ID(s):
1908536 1836752 2105494
NSF-PAR ID:
10356561
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
International Conference for High Performance Computing, Networking, Storage and Analysis
Page Range / eLocation ID:
1 to 12
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Edge cloud solutions that bring the cloud closer to the sensors can be very useful to meet the low latency requirements of many Internet-of-Things (IoT) applications. However, IoT traffic can also be intermittent, so running applications constantly can be wasteful. Therefore, having a serverless edge cloud that is responsive and provides low-latency features is a very attractive option for a resource and cost-efficient IoT application environment.In this paper, we discuss the key components needed to support IoT traffic in the serverless edge cloud and identify the critical challenges that make it difficult to directly use existing serverless solutions such as Knative, for IoT applications. These include overhead from heavyweight components for managing the overall system and software adaptors for communication protocol translation used in off-the-shelf serverless platforms that are designed for large-scale centralized clouds. The latency imposed by ‘cold start’ is a further deterrent.To address these challenges we redesign several components of the Knative serverless framework. We use a streamlined protocol adaptor to leverage the MQTT IoT protocol in our serverless framework for IoT event processing. We also create a novel, event-driven proxy based on the extended Berkeley Packet Filter (eBPF), to replace the regular heavyweight Knative queue proxy. Our preliminary experimental results show that the event-driven proxy is a suitable replacement for the queue proxy in an IoT serverless environment and results in lower CPU usage and a higher request throughput. 
    more » « less
  2. Edge computing is an emerging paradigm whose goal is to boost with cloud resources available at the edge the computational capability of otherwise weak devices. This paradigm is mostly attractive to reduce user perceived latency. A central mechanism in edge computing is cyber-foraging, i.e., the search and delegation to capable edge cloud processes of tasks too complex, time consuming or resource intensive to be running on user devices or low-latency demanding to be running remotely, as a form of edge function. An edge function is any network or device-specific process that may be run on an edge process instead. Despite the recent interest for this technology from industry and academia, cyber-foraging techniques and protocols have yet to be standardized. In this paper, we leverage decomposition theory to propose an architecture providing insights in the design and implementation of protocols for cyber-foraging of multiple edge functions. In contrast with several existing solutions, we argue that the (distributed) cyber-foraging orchestration should be policy-based and not an ad-hoc solution, i.e., either a pure edge cloud burden or a device decision. To this end, via simulations, we show how our approach can be used by edge computing providers and application programmers to compare and evaluate different alternative cyber-foraging solutions. Our decomposition-based approach has general applicability to other network utility maximization problems, even outside the edge computing domain. 
    more » « less
  3. null (Ed.)
    Edge clouds can provide very responsive services for end-user devices that require more significant compute capabilities than they have. But edge cloud resources such as CPUs and accelerators such as GPUs are limited and must be shared across multiple concurrently running clients. However, multiplexing GPUs across applications is challenging. Further, edge servers are likely to require considerable amounts of streaming data to be processed. Getting that data from the network stream to the GPU can be a bottleneck, limiting the amount of work GPUs do. Finally, the lack of prompt notification of job completion from GPU also results in ineffective GPU utilization. We propose a framework that addresses these challenges in the following manner. We utilize spatial sharing of GPUs to multiplex the GPU more efficiently. While spatial sharing of GPU can increase GPU utilization, the uncontrolled spatial sharing currently available with state-of-the-art systems such as CUDA-MPS can cause interference between applications, resulting in unpredictable latency. Our framework utilizes controlled spatial sharing of GPU, which limits the interference across applications. Our framework uses the GPU DMA engine to offload data transfer to GPU, therefore preventing CPU from being bottleneck while transferring data from the network to GPU. Our framework uses the CUDA event library to have timely, low overhead GPU notifications. Preliminary experiments show that we can achieve low DNN inference latency and improve DNN inference throughput by a factor of ∼1.4. 
    more » « less
  4. Many have predicted the future of the Web to be the integration of Web content with the real-world through technologies such as Augmented Reality (AR). This has led to the rise of Extended Reality (XR) Web Browsers used to shorten the long AR application development and deployment cycle of native applications especially across different platforms. As XR Browsers mature, we face new challenges related to collaborative and multi-user applications that span users, devices, and machines. These collaborative XR applications require: (1) networking support for scaling to many users, (2) mechanisms for content access control and application isolation, and (3) the ability to host application logic near clients or data sources to reduce application latency. In this paper, we present the design and evaluation of the AR Edge Networking Architecture (ARENA) which is a platform that simplifies building and hosting collaborative XR applications on WebXR capable browsers. ARENA provides a number of critical components including: a hierarchical geospatial directory service that connects users to nearby servers and content, a token-based authentication system for controlling user access to content, and an application/service runtime supervisor that can dispatch programs across any network connected device. All of the content within ARENA exists as endpoints in a PubSub scene graph model that is synchronized across all users. We evaluate ARENA in terms of client performance as well as benchmark end-to-end response-time as load on the system scales. We show the ability to horizontally scale the system to Internet-scale with scenes containing hundreds of users and latencies on the order of tens of milliseconds. Finally, we highlight projects built using ARENA and showcase how our approach dramatically simplifies collaborative multi-user XR development compared to monolithic approaches. 
    more » « less
  5. The rapid growth in technology and wide use of internet has increased smart applications such as intelligent transportation control system, and Internet of Things, which heavily rely on an efficient and reliable connectivity network. To overcome high bandwidth work load on the network, as well as minimize latency for real-time applications, the computation can be moved from the central cloud to a distributed edge cloud. The edge computing benefits various smart applications that uses distributed network for data analytics and services. Different from the existing cloud management solutions, edge computing needs to move cloud management services towards distributed heterogeneous edge nodes for multi-tenant user applications. However, existing cloud management services do not offer remote deployment of multi-tenant user applications on the cloud of edge nodes. In this paper, we propose a practical edge cloud software framework for deploying multi-tenant distributed smart applications. Having multiple distributed end nodes, auto discovery of all active end nodes is required for deploying multi-tenant user applications. However, existing cloud solutions require either private network or fixed IP address, which is not achievable for the distributed edge nodes. Most of the edge nodes connected through the public internet without fixed IP, and some of them even connect through IEEE 802.15 based sensor networks. We propose to build a software platform to manage the distributed edge nodes as well as support services to deploy and launch isolated, multi-tenant user applications through a lightweight container. We propose an architectural solution to remotely access edge cloud management services through intermittent internet connections. We open sourced our whole set of software solutions, and analyzed the major performance metrics of the edge cloud platform. 
    more » « less