Serverless computing is an emerging event-driven programming model that accelerates the development and deployment of scalable web services on cloud computing systems. Though widely integrated with public clouds, serverless computing remains nascent in edge-based IoT deployments. In this work, we design and develop STOIC (Serverless TeleOperable HybrId Cloud), an IoT application deployment and offloading system that extends the serverless model in three ways. First, STOIC adopts a dynamic feedback control mechanism to precisely predict latency and dispatch workloads uniformly across edge and cloud systems using a distributed serverless framework. Second, STOIC leverages hardware acceleration (e.g., GPU resources) for serverless function execution when available from the underlying cloud system. Third, STOIC can be configured in multiple ways to overcome deployment variability associated with public cloud use. Finally, we empirically evaluate STOIC using real-world machine learning applications and multi-tier IoT deployments (edge and cloud). We show that STOIC can be used to train image processing workloads (for object recognition), once thought too resource intensive for edge deployments. We find that STOIC reduces overall execution time (response latency) and achieves placement accuracy ranging from 92% to 97%.
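As a sketch of how such a feedback-driven placement decision might look, the snippet below keeps a sliding window of measured latencies per runtime and dispatches each batch to the runtime with the lowest predicted cost. The runtime names, window size, and cost model are illustrative assumptions, not details taken from STOIC itself.

```python
import statistics
from collections import defaultdict, deque

# Hypothetical runtime targets; names are illustrative, not STOIC's.
RUNTIMES = ["edge-cpu", "cloud-cpu", "cloud-gpu"]

class LatencyPredictor:
    """Feedback-control sketch: predict per-runtime latency from a
    sliding window of observations, then dispatch to the cheapest."""

    def __init__(self, window=20):
        self.history = defaultdict(lambda: deque(maxlen=window))

    def observe(self, runtime, latency_s):
        # Feed back each measured end-to-end latency (deploy + execute).
        self.history[runtime].append(latency_s)

    def predict(self, runtime):
        samples = self.history[runtime]
        # Optimistic 0.0 for unseen runtimes so each gets explored once.
        return statistics.mean(samples) if samples else 0.0

    def place(self, batch_size):
        # Crude cost model: predicted unit latency scaled by batch size.
        return min(RUNTIMES, key=lambda r: self.predict(r) * batch_size)

p = LatencyPredictor()
for rt, lat in [("edge-cpu", 2.1), ("cloud-cpu", 1.4), ("cloud-gpu", 0.8)]:
    p.observe(rt, lat)
print(p.place(batch_size=4))  # -> cloud-gpu
```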
Characterizing task completion latencies in multi-point multi-quality fog computing systems
Fog computing, which distributes computing resources to multiple locations between the Internet of Things (IoT) devices and the cloud, is attracting considerable attention from academia and industry. Yet, despite the excitement about the potential of fog computing, few comprehensive studies quantitatively characterizing the properties of fog computing architectures have been conducted. In this paper, we examine the statistical properties of fog computing task completion latencies, which must be understood to develop algorithms that match IoT nodes’ tasks with the best execution points within the fog computing substrate. Towards characterizing task completion latencies, we developed and deployed a set of benchmarks in 6 different locations, which included local nodes of different grades, conventional cloud computing services in two different regions, and Amazon Web Services (AWS) and Microsoft Azure serverless computing options. Using the developed infrastructure, we conducted a series of targeted experiments with a node invoking our benchmarks from different locations and in different conditions. The empirical study elucidated several important properties of task execution latencies, including latency variation across different execution points and execution options, and stability with respect to time. The study also demonstrated important properties of serverless execution options, and showed that the statistical structure of computing latencies can be accurately characterized based on a small number (only 10–50) of latency samples. The complete measurement set we captured as part of this study is publicly available.
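The observation that 10–50 samples suffice to characterize latency structure can be illustrated with simple order statistics. The sketch below summarizes a small latency batch; the summary fields and the log-normal test data are assumptions for illustration, not the study's actual model.

```python
import random
import statistics

def characterize(samples):
    """Summarize a small batch (10-50) of task-completion latencies
    with order statistics; fields chosen for illustration."""
    xs = sorted(samples)
    q = lambda p: xs[min(len(xs) - 1, int(p * len(xs)))]
    return {
        "median_s": statistics.median(xs),
        "p90_s": q(0.90),
        "iqr_s": q(0.75) - q(0.25),
    }

# Synthetic stand-in for one execution point's latencies (seconds);
# the study's real measurement set is published separately.
rng = random.Random(0)
obs = [rng.lognormvariate(-1.5, 0.4) for _ in range(30)]
print(characterize(obs))
```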
- PAR ID: 10195326
- Date Published:
- Journal Name: Computer Networks
- Volume: 181
- Issue: 9
- ISSN: 1389-1286
- Page Range / eLocation ID: 107526
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Cloud computing has grown because of lowered costs due to economies of scale and multiplexing. Serverless computing exploits multiplexing in cloud computing; however, to achieve the low latency required by IoT applications, the cloud should be moved nearer to the IoT device and the cold start problem should be addressed. Using a real-world dataset, we showed, through an implementation in an open-source cloud environment based on Knative, that a serverless approach to managing IoT traffic is feasible, that it uses fewer resources than a serverful approach, and that traffic prediction with prefetching can mitigate the cold start delay penalty. However, applying the Knative framework directly to IoT traffic without considering the execution context incurs unnecessary overhead.
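A minimal sketch of the traffic-prediction-with-prefetching idea follows: forecast the near-term arrival rate from recent history and keep enough warm pods to absorb it. The class, the EWMA forecaster, and the per-pod capacity figure are hypothetical; this is not Knative's autoscaler API.

```python
import math
from collections import deque

class Prewarmer:
    """Prediction-driven prefetching sketch: keep enough warm pods to
    cover the forecast arrival rate so invocations skip cold starts.
    Not a Knative API; parameters are illustrative."""

    def __init__(self, window=12, per_pod_rps=5.0):
        self.arrivals = deque(maxlen=window)  # requests per interval
        self.per_pod_rps = per_pod_rps        # assumed pod capacity

    def record(self, count):
        self.arrivals.append(count)

    def forecast(self):
        # Exponentially weighted moving average over the window.
        pred, alpha = 0.0, 0.5
        for count in self.arrivals:
            pred = alpha * count + (1 - alpha) * pred
        return pred

    def pods_to_prewarm(self):
        return math.ceil(self.forecast() / self.per_pod_rps)

w = Prewarmer()
for c in [3, 8, 12, 20]:
    w.record(c)
print(w.pods_to_prewarm())  # pods to spin up before traffic arrives
```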
Fog computing has been advocated as an enabling technology for computationally intensive services in connected smart vehicles. Most existing works focus on analyzing and optimizing the queueing and workload processing latencies, ignoring the fact that the access latency between vehicles and fog/cloud servers can sometimes dominate the end-to-end service latency. This motivates the work in this paper, where we report a five-month urban measurement study of the wireless access latency between a connected vehicle and a fog computing system supported by commercially available multi-operator LTE networks. We propose AdaptiveFog, a novel framework for autonomous and dynamic switching between different LTE operators that implement fog/cloud infrastructure. The main objective here is to maximize the service confidence level, defined as the probability that the tolerable latency threshold for each supported type of service can be guaranteed. AdaptiveFog has been implemented as a smartphone app running on a moving vehicle. The app periodically measures the round-trip time between the vehicle and fog/cloud servers. An empirical spatial statistic model is established to characterize the spatial variation of the latency across the main driving routes of the city. To quantify the performance difference between different LTE networks, we introduce the weighted Kantorovich-Rubinstein (K-R) distance. An optimal policy is derived for the vehicle to dynamically switch between LTE operators’ networks while driving. Extensive analysis and simulation are performed based on our latency measurement dataset. Our results show that AdaptiveFog achieves around 30% and 50% improvement in the confidence level of fog and cloud latency, respectively.
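The confidence-level objective and the K-R comparison lend themselves to a compact sketch. Below, confidence is the empirical probability that a sample meets the latency threshold, and the distance is the standard unweighted 1-D Wasserstein-1 between equal-size samples (the paper uses a weighted variant); operator names and data are made up.

```python
def confidence(latencies_ms, threshold_ms):
    # Empirical probability that a sample meets the latency threshold.
    return sum(l <= threshold_ms for l in latencies_ms) / len(latencies_ms)

def kr_distance(a, b):
    # Wasserstein-1 between equal-size empirical samples: mean gap of
    # sorted values. (AdaptiveFog weights this by service importance.)
    xs, ys = sorted(a), sorted(b)
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

def pick_operator(measurements, threshold_ms):
    # measurements: {operator: [rtt_ms, ...]} for the current segment.
    return max(measurements,
               key=lambda op: confidence(measurements[op], threshold_ms))

ops = {"operator-A": [42, 55, 61, 48], "operator-B": [38, 95, 40, 120]}
print(pick_operator(ops, threshold_ms=60))  # -> operator-A
print(kr_distance(ops["operator-A"], ops["operator-B"]))
```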
Serverless computing allows customers to submit their jobs to the cloud for execution, with the resource provisioning being taken care of by the cloud provider. Serverless functions are often short-lived and have modest resource requirements, thereby presenting an opportunity to improve server utilization by colocating them with latency-sensitive customer workloads. This paper presents ServerMore, a server-level resource manager that opportunistically colocates customer serverless jobs with serverful customer VMs. ServerMore dynamically regulates the CPU, memory bandwidth, and LLC resources on the server to ensure that the colocation between serverful and serverless workloads does not impact application tail latencies. By selectively admitting serverless functions and inferring the performance of black-box serverful workloads, ServerMore improves resource utilization on average by 35.9% to 245% compared to prior works, while having a minimal impact on the latency of both serverful applications and serverless functions.
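A hedged sketch of the selective-admission idea: colocate a serverless function only when its resource demands fit the server's slack and its inferred interference stays within the VM's tail-latency headroom. All field names and the interference estimate are illustrative assumptions, not ServerMore's actual interfaces.

```python
from dataclasses import dataclass

@dataclass
class ServerState:
    cpu_free: float          # idle cores
    mem_bw_free: float       # unused memory bandwidth (GB/s)
    llc_ways_free: int       # unallocated last-level-cache ways
    vm_tail_slack_ms: float  # headroom before the VM's tail-latency SLO

@dataclass
class FunctionProfile:
    cpu: float
    mem_bw: float
    llc_ways: int
    est_interference_ms: float  # predicted added tail latency on the VM

def admit(state: ServerState, fn: FunctionProfile) -> bool:
    # Admit only if resources fit AND the inferred interference on the
    # black-box serverful VM stays within its tail-latency slack.
    fits = (fn.cpu <= state.cpu_free
            and fn.mem_bw <= state.mem_bw_free
            and fn.llc_ways <= state.llc_ways_free)
    return fits and fn.est_interference_ms <= state.vm_tail_slack_ms

print(admit(ServerState(2.0, 4.0, 3, 5.0), FunctionProfile(1.0, 2.5, 2, 3.0)))
```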
The increased use of microservices to build web applications has spurred the rapid growth of Function-as-a-Service (FaaS) or serverless computing platforms. While FaaS simplifies provisioning and scaling for application developers, it introduces new challenges in resource management that need to be handled by the cloud provider. Our analysis of popular serverless workloads indicates that schedulers need to handle functions that are very short-lived, have unpredictable arrival patterns, and require expensive setup of sandboxes. The challenge of running a large number of such functions in a multi-tenant cluster makes existing scheduling frameworks unsuitable. We present Archipelago, a platform that enables low-latency request execution in a multi-tenant serverless setting. Archipelago views each application as a DAG of functions, and every DAG is associated with a latency deadline. Archipelago achieves its per-DAG request latency goals by: (1) partitioning a given cluster into a number of smaller worker pools and associating each pool with a semi-global scheduler (SGS), (2) using a latency-aware scheduler within each SGS along with proactive sandbox allocation to reduce overheads, and (3) using a load balancing layer to route requests for different DAGs to the appropriate SGS and automatically scale the number of SGSs per DAG. Our testbed results show that Archipelago meets the latency deadline for more than 99% of realistic application request workloads, and reduces tail latencies by up to 36x compared to state-of-the-art serverless platforms.
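The routing and scheduling split can be sketched as follows: a load balancer sends a DAG's functions to the least-loaded semi-global scheduler, which orders invocations earliest-deadline-first. Class and method names are invented for illustration and deliberately simplify away sandbox allocation and SGS autoscaling.

```python
import heapq
import itertools
import time

class SemiGlobalScheduler:
    """EDF sketch of a latency-aware scheduler for one worker pool."""

    def __init__(self, name):
        self.name = name
        self.queue = []              # (abs_deadline, tiebreak, function_id)
        self.load = 0
        self._seq = itertools.count()

    def submit(self, function_id, deadline_s):
        heapq.heappush(self.queue, (time.time() + deadline_s,
                                    next(self._seq), function_id))
        self.load += 1

    def next_invocation(self):
        if self.queue:
            self.load -= 1
            return heapq.heappop(self.queue)[2]

def route(sgs_pool, dag_functions, deadline_s):
    # Load-balancing layer: keep a whole DAG on the least-loaded SGS so
    # its end-to-end deadline is managed in one place.
    target = min(sgs_pool, key=lambda s: s.load)
    for fn in dag_functions:
        target.submit(fn, deadline_s)
    return target

pool = [SemiGlobalScheduler("sgs-0"), SemiGlobalScheduler("sgs-1")]
sgs = route(pool, ["resize", "detect", "notify"], deadline_s=0.2)
print(sgs.name, sgs.next_invocation())  # earliest-deadline function first
```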