Fog computing, which distributes computing resources to multiple locations between the Internet of Things (IoT) devices and the cloud, is attracting considerable attention from academia and industry. Yet, despite the excitement about the potential of fog computing, few comprehensive studies have quantitatively characterized the properties of fog computing architectures. In this paper we examine the statistical properties of fog computing task completion latencies, which must be understood in order to develop algorithms that match IoT nodes’ tasks with the best execution points within the fog computing substrate. To characterize task completion latencies, we developed and deployed a set of benchmarks in six different locations, which included local nodes of different grades, conventional cloud computing services in two different regions, and Amazon Web Services (AWS) and Microsoft Azure serverless computing options. Using the developed infrastructure, we conducted a series of targeted experiments with a node invoking our benchmarks from different locations and in different conditions. The empirical study elucidated several important properties of task execution latencies, including latency variation across different execution points and execution options, and stability with respect to time. The study also demonstrated important properties of serverless execution options, and showed that the statistical structure of computing latencies can be accurately characterized based on a small number (only 10–50) of latency samples. The complete measurement set we have captured as part of this study is publicly available.
Exploiting Equivalence to Efficiently Enhance the Accuracy of Cognitive Services
Equivalent services deliver the same functionality with dissimilar non-functional characteristics, including latency, accuracy, and cost. With these dissimilarities in mind, developers can exploit the combined execution of equivalent services to increase accuracy, shorten latency, or reduce cost. However, it remains unknown how to effectively combine equivalent services to satisfy application requirements. With the recent surge in popularity of machine learning, different vendors offer a plethora of equivalent services, whose characteristics are mostly undocumented. As a result, developers cannot make an informed decision about which service to select from a set of equivalent services. To address this problem, we explore different service combination strategies (i.e., majority voting, weighted-majority voting, stacking, and custom) to ascertain their impact on non-functional characteristics. In particular, we study how these strategies impact the accuracy, cost, and latency of the face detection task and validate our findings on the sentiment analysis task. We consider the combined executions of commercial web services, deployed in the cloud, and open-source implementations, deployed as edge services. Our evaluation reveals that the combined execution of equivalent services is most effective for improving cost and latency. Informed by our experimental results, we formulate practical guidelines to help developers identify the best execution strategy for a given set of services.
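As a rough illustration of two of the combination strategies named in the abstract (majority voting and weighted-majority voting), the following minimal sketch may help. It is not the authors' implementation: the function names, the label strings, and the weights are hypothetical, and each "service" is represented only by the label it returned.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label returned by the most services.

    Ties are broken in favor of the label seen first.
    """
    return Counter(predictions).most_common(1)[0][0]


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting services have the largest total weight.

    `weights` are hypothetical per-service trust scores (for example,
    accuracy measured on a validation set); they are an assumption here.
    """
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


# Hypothetical outputs of three equivalent face detection services.
labels = ["face", "no_face", "no_face"]
print(majority_vote(labels))
print(weighted_majority_vote(labels, [0.9, 0.3, 0.4]))
```

With these assumed weights, the single highly weighted service outvotes the two lower-weighted ones, which is the practical difference between the two strategies.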
- Award ID(s):
- 1717065
- PAR ID:
- 10154790
- Date Published:
- Journal Name:
- 2019 IEEE International Conference on Cloud Computing Technology and Science (CloudCom)
- Page Range / eLocation ID:
- 143 to 150
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
We study the voting game where agents' preferences are endogenously decided by the information they receive, and they can collaborate in a group. We show that strategic voting behaviors have a positive impact on leading to the "correct" decision, outperforming the common non-strategic behavior of informative voting and sincere voting. Our results give merit to strategic voting for making good decisions. To this end, we investigate a natural model, where voters' preferences between two alternatives depend on a discrete state variable that is not directly observable. Each voter receives a private signal that is correlated with the state variable. We reveal a surprising equilibrium between a strategy profile being a strong equilibrium and leading to the decision favored by the majority of agents conditioned on them knowing the ground truth (referred to as the informed majority decision): as the size of the vote goes to infinity, every ε-strong Bayes Nash Equilibrium with ε converging to 0 formed by strategic agents leads to the informed majority decision with probability converging to 1. On the other hand, we show that informative voting leads to the informed majority decision only under unbiased instances, and sincere voting leads to the informed majority decision only when it also forms an equilibrium.
-
Data-driven causality discovery is a common way to understand causal relationships among different components of a system. We study how to achieve scalable data-driven causality discovery on Amazon Web Services (AWS) and Microsoft Azure cloud and propose a causality discovery as a service (CDaaS) framework. With this framework, users can easily rerun previous causality discovery experiments or run causality discovery with different setups (such as new datasets or causality discovery parameters). Our CDaaS leverages the Cloud Container Registry service and the Virtual Machine service to achieve scalable causality discovery with different discovery algorithms. We further conducted extensive experiments and benchmarking of our CDaaS to understand the effects of seven factors (big data engine parameter setting, virtual machine instance number, type, subtype, size, cloud service, cloud provider) and how to best provision cloud resources for our causality discovery service based on certain goals, including execution time, budgetary cost, and cost-performance ratio. We report our findings from the benchmarking, which can help obtain optimal configurations based on each application’s characteristics. The findings show that proper configurations could lead to both faster execution time and lower budgetary cost.
-
With faster wireless networks and server GPUs, offloading high-accuracy but compute-intensive AR tasks implemented in Deep Neural Networks (DNNs) to edge servers offers a promising way to support high-QoE Augmented/Mixed Reality (AR/MR) applications. A cost-effective way for AR app vendors to deploy such edge-assisted AR apps to support a large user base is to use commercial Machine-Learning-as-a-Service (MLaaS) deployed at the edge cloud. To maximize cost-effectiveness, such an MLaaS provider faces a key design challenge, i.e., how to maximize the number of clients concurrently served by each GPU server in its cluster while meeting per-client AR task accuracy SLAs. The above AR offloading inference serving problem differs from generic inference serving or video analytics serving in one fundamental way: due to the use of local tracking, which reuses the last server-returned inference result to derive results for the current frame, the offloading frequency and end-to-end latency of each AR client directly affect its AR task accuracy (for all the frames). In this paper, we present ARISE, a framework that optimizes the edge server capacity in serving edge-assisted AR clients. Our design exploits the intricate interplay between per-client offloading schedule and batched inference on the server by proactively coordinating offloading request streams from different AR clients. Our evaluation using a large set of emulated AR clients and a 10-phone testbed shows that ARISE supports 1.7x to 6.9x more clients compared to various baselines while keeping the per-client accuracy within the client-specified accuracy SLAs.
-
Many distributed applications rely on the strong guarantees of sequential consistency to ensure program correctness. Replication systems or frameworks that support such applications typically implement sequential consistency by employing voting schemes among replicas. However, such schemes suffer dramatic performance loss when deployed globally due to increased long-haul message latency between replicas in separate data centers. One approach to overcome this challenge involves deploying distinct instances of a service in each geographic cluster, then loosely coupling those services. Unfortunately, the consistency guarantees of the individual replication system instances do not compose when coupled this way, sacrificing overall sequential consistency. We propose an alternative approach, the consistent, propagatable partition tree (CoPPar Tree), a data structure that spans multiple data centers and data partitions, and that realizes sequential consistency using divide-and-conquer. By leveraging the geospatial affinity of data used in global services, CoPPar Tree can localize reads and writes in a sequentially consistent manner, improving the overall performance of a sequentially consistent service deployed at global scale. Our work allows clients to access local data and fully run SMR protocols locally without additional overhead. We implemented CoPPar Tree by enhancing ZooKeeper with an extension called ZooTree, which can be deployed without changing existing ZooKeeper clusters, and which achieves a speedup of 100× for reads and up to 10× for writes over prior work.

