The paradigm shift of deploying applications to
the cloud has introduced both opportunities and challenges.
Although clouds use elasticity to scale resource usage at runtime
to help meet an application’s performance requirements,
developers are still challenged by unpredictable performance,
little control over the execution environment, and differences among
cloud service providers, all while being charged for their cloud
usage. Application performance stability is particularly affected
by multi-tenancy, in which the hardware is shared among varying
applications and virtual machines. Developers porting their
applications need to meet performance requirements, but testing
on the cloud under the effects of performance uncertainty is
difficult and expensive, due to high cloud usage costs.
This paper presents a first approach to testing how an application's
performance on typical inputs will be affected by performance
uncertainty, without incurring the undue costs of brute-force
testing in the cloud. We specify cloud uncertainty testing
criteria, design a test-based strategy to characterize the black-box
cloud's performance distributions using these testing criteria,
and support execution of tests to characterize the resource usage
and cloud baseline performance of the application to be deployed.
Importantly, we develop a smart test oracle that estimates the
application’s performance with certain confidence levels using
the above characterization test results and determines whether
it will meet its performance requirements. We evaluated our
testing approach on both the Chameleon Cloud and Amazon Web
Services; results indicate that this testing strategy shows promise
as a cost-effective approach to test for performance effects of
cloud uncertainty when porting an application to the cloud.
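To make the oracle idea concrete, here is a minimal sketch of a confidence-based pass/fail check. The bootstrap procedure, function names, and sample runtimes below are illustrative assumptions, not the paper's actual oracle.

```python
# Hypothetical sketch of a confidence-based test oracle. The bootstrap
# procedure and all names here are illustrative, not the paper's method.
import random
import statistics

def bootstrap_upper_bound(samples, confidence=0.95, n_resamples=2000):
    """Estimate an upper confidence bound on mean runtime via bootstrap resampling."""
    means = []
    for _ in range(n_resamples):
        resample = random.choices(samples, k=len(samples))
        means.append(statistics.mean(resample))
    means.sort()
    return means[int(confidence * (n_resamples - 1))]

def oracle_passes(characterization_runtimes_s, requirement_s, confidence=0.95):
    """Pass if, at the given confidence, mean runtime stays within the requirement."""
    return bootstrap_upper_bound(characterization_runtimes_s, confidence) <= requirement_s

# Example: runtimes (seconds) from a handful of cheap characterization runs.
runs = [11.8, 12.4, 13.1, 12.0, 14.6, 12.7, 13.3, 12.2]
print(oracle_passes(runs, requirement_s=15.0))  # True if a 15 s requirement likely holds
```

The design point is that a small number of inexpensive characterization runs, rather than exhaustive in-cloud testing, feeds the confidence estimate.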
Storm-RTS: Stream Processing with Stable Performance for Multi-Cloud and Cloud-edge
Stream Processing Engines (SPEs) traditionally deploy applications on a set of shared workers (e.g., threads, processes, or containers), requiring complex performance management by SPEs and application developers. We explore a new approach that replaces workers with Rate-based Abstract Machines (RBAMs). This allows SPEs to translate stream operations into FaaS invocations, and exploit guaranteed invocation rates to manage performance. This approach enables SPE applications to achieve transparent and predictable performance. We realize the approach in the Storm-RTS system. Exploring 36 stream processing scenarios over 5 different hardware configurations, we demonstrate several key advantages. First, Storm-RTS provides stable application performance and can enable flexible reconfiguration across cloud resource configurations. Second, SPEs built on RBAM can be resource-efficient and scalable. Finally, Storm-RTS allows the stream-processing paradigm to be extended from the cloud to the edge, using its performance stability to hide edge heterogeneity and resource competition. An experiment with 4 cloud and edge sites over 300 cores shows how Storm-RTS can support flexible reconfiguration and simple high-level declarative policies that optimize resource cost or other criteria.
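As a rough illustration of the rate-based idea (not Storm-RTS's actual RBAM interface, which the abstract does not specify), an SPE could drive an operator as fixed-rate, FaaS-style invocations:

```python
# Illustrative sketch: driving a stream operator as rate-based FaaS-style
# invocations. All names here are hypothetical assumptions.
import time
from collections import deque

class WordCountOperator:
    """Toy stream operator standing in for a FaaS-deployed function."""
    def __init__(self):
        self.counts = {}
    def invoke(self, batch):
        for word in batch:
            self.counts[word] = self.counts.get(word, 0) + 1

def run_rate_based(operator, queue, rate_hz, ticks, batch_size=4):
    """Invoke the operator at a fixed rate: the guarantee an RBAM would exploit."""
    period = 1.0 / rate_hz
    for _ in range(ticks):
        start = time.monotonic()
        batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        operator.invoke(batch)  # one invocation per tick, at a known rate
        time.sleep(max(0.0, period - (time.monotonic() - start)))

q = deque("to be or not to be".split())
op = WordCountOperator()
run_rate_based(op, q, rate_hz=10, ticks=3)
print(op.counts)  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

Because every invocation happens at a known rate, the operator's throughput becomes predictable without per-worker performance tuning.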
- Award ID(s):
- 1901466
- NSF-PAR ID:
- 10495146
- Publisher / Repository:
- IEEE
- Date Published:
- ISBN:
- 979-8-3503-0481-7
- Page Range / eLocation ID:
- 45 to 57
- Format(s):
- Medium: X
- Location:
- Chicago, IL, USA
- Sponsoring Org:
- National Science Foundation
More Like this
- Many Internet of Things (IoT) applications are time-critical and dynamically changing. However, traditional data processing systems (e.g., stream processing systems, cloud-based IoT data processing systems, wide-area data analytics systems) are not well-suited for these IoT applications. These systems often do not scale well with a large number of concurrently running IoT applications, do not support low-latency processing under limited computing resources, and do not adapt to the level of heterogeneity and dynamicity commonly present in edge environments. This suggests a need for a new edge stream processing system that advances the stream processing paradigm to achieve efficiency and flexibility under the constraints presented by edge computing architectures. We present DART, a scalable and adaptive edge stream processing engine that enables fast processing of a large number of concurrently running IoT applications' queries in dynamic edge environments. The novelty of our work is the introduction of a dynamic dataflow abstraction by leveraging distributed hash table (DHT) based peer-to-peer (P2P) overlay networks, which can automatically place, chain, and scale stream operators to reduce query latency, adapt to edge dynamics, and recover from failures. We show analytically and empirically that DART outperforms Storm and EdgeWise on query latency and significantly improves scalability and adaptability when processing a large number of real-world IoT stream applications' queries. DART significantly reduces application deployment setup times, becoming the first streaming engine to support DevOps for IoT applications on edge platforms.
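A toy sketch of the DHT idea: operators and nodes hash onto the same ring, so placement needs no central scheduler. The ring below is our own illustration, not DART's implementation.

```python
# Minimal consistent-hashing ring: stream operators are placed on the first
# edge node clockwise from their hash position. Illustration only.
import hashlib

def ring_position(name: str) -> int:
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

def place_operator(operator: str, nodes: list[str]) -> str:
    """Assign the operator to the first node clockwise from its ring position."""
    op_pos = ring_position(operator)
    return min(nodes, key=lambda n: (ring_position(n) - op_pos) % 2**160)

edge_nodes = ["edge-a", "edge-b", "edge-c"]
for op in ["filter", "window", "aggregate"]:
    print(op, "->", place_operator(op, edge_nodes))
```

A useful property of this scheme is that adding or losing a node only remaps the operators nearest to it on the ring, which is what makes failure recovery and adaptation cheap.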
- IoT (Internet of Things) devices such as sensors have been actively used in 'fogs' to provide critical data in scenarios such as disaster response or in-home healthcare. Since IoT devices typically operate in resource-constrained computing environments at the network edge, data transfer performance to the cloud as well as end-to-end security have to be robust and customizable. In this paper, we present the design and implementation of a middleware featuring "intermittent" and "flexible" end-to-end security for cloud-fog communications. Intermittent security copes with unreliable network connections, and flexibility is achieved through security configurations that are tailored to application needs. Our experiment results show how our middleware, which leverages static pre-shared keys, forms a promising solution for delivering lightweight, fast, and resource-aware security for a variety of IoT-based applications.
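A minimal sketch of the pre-shared-key idea, using the pyca/cryptography package for authenticated encryption; the key handling and message framing here are simplified illustrations, not the middleware's actual protocol.

```python
# Pre-shared-key authenticated encryption for cloud-fog messages: a static
# key avoids per-connection handshakes on constrained devices. Sketch only.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

PSK = os.urandom(32)  # in practice, provisioned on both device and cloud

def seal(plaintext: bytes, device_id: bytes) -> bytes:
    nonce = os.urandom(12)  # fresh nonce per message
    return nonce + AESGCM(PSK).encrypt(nonce, plaintext, device_id)

def open_(message: bytes, device_id: bytes) -> bytes:
    nonce, ciphertext = message[:12], message[12:]
    return AESGCM(PSK).decrypt(nonce, ciphertext, device_id)  # raises on tampering

msg = seal(b'{"temp": 21.5}', b"sensor-17")
print(open_(msg, b"sensor-17"))  # b'{"temp": 21.5}'
```

Skipping the handshake is also what makes the scheme tolerant of intermittent connectivity: any single message can be sealed and later verified on its own.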
- Edge computing has emerged as a popular paradigm for running latency-sensitive applications due to its ability to offer lower network latencies to end-users. In this paper, we argue that despite its lower network latency, the resource-constrained nature of the edge can result in higher end-to-end latency, especially at higher utilizations, when compared to cloud data centers. We study this edge performance inversion problem through an analytic comparison of edge and cloud latencies and analyze conditions under which the edge can yield worse performance than the cloud. To verify our analytic results, we conduct a detailed experimental comparison of the edge and the cloud latencies using a realistic application and real cloud workloads. Both our analytical and experimental results show that even at moderate utilizations, the edge queuing delays can offset the benefits of lower network latencies, and even result in performance inversion where running in the cloud would provide superior latencies. We finally discuss practical implications of our results and provide insights into how application designers and service providers should design edge applications and systems to avoid these pitfalls.
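The inversion argument can be reproduced with a back-of-the-envelope M/M/1 queueing model; this is our own simplification, and the paper's analysis may use a different model.

```python
# Worked example of edge performance inversion under M/M/1 queueing:
# total latency = network RTT + 1 / (service_rate - arrival_rate).

def total_latency_ms(rtt_ms, service_rate, utilization):
    arrival_rate = utilization * service_rate
    return rtt_ms + 1000.0 / (service_rate - arrival_rate)

# Hypothetical numbers: edge has 5 ms RTT but only 100 req/s capacity;
# cloud has 40 ms RTT but 1000 req/s capacity.
for util in (0.5, 0.9, 0.95):
    edge = total_latency_ms(5, 100, util)
    cloud = total_latency_ms(40, 1000, util)
    print(f"util={util:.2f}: edge={edge:.1f} ms, cloud={cloud:.1f} ms")
# At 95% utilization the edge's 200 ms queueing delay dwarfs its RTT
# advantage, so the cloud wins despite the longer network path.
```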
- Mobile applications have become increasingly sophisticated. Emerging cognitive assistance applications can involve multiple computationally intensive modules working continuously and concurrently, further straining the already limited resources on these mobile devices. While computation offloading to the edge or the cloud is still the de facto solution, existing approaches are limited by intra-application operations only or edge-/cloud-centric scheduling. Instead, we argue that operating system level coordination is needed on the mobile side to adequately support the prospects of multi-application offloading. Specifically, both the local mobile system resources and the network bandwidth to reach the cloud need to be allocated intelligently among concurrent offloading jobs. In this paper, we build a system-level scheduler service, LinkShare, that wraps over the operating system scheduler to coordinate among multiple offloading requests. We further study the scheduling requirements and suitable metrics, and find that the most intuitive approaches of minimizing the end-to-end processing time or earliest-deadline-first scheduling do not work well. Instead, LinkShare adopts earliest-deadline first with limited sharing (EDF-LS), which balances real-time requirements and fairness. Extensive evaluation of an Android implementation of LinkShare shows that adding this additional scheduler is essential, and that EDF-LS reduces the deadline miss events by up to 30% compared to the baseline.
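The abstract does not define EDF-LS precisely, so the sketch below is only our guess at the flavor of the idea: plain earliest-deadline-first, except that an app which has already consumed more than its share of recent service time is skipped. All names and the share cap are hypothetical.

```python
# EDF with a "limited sharing" fairness cap: a guess at the idea's flavor,
# not LinkShare's actual algorithm.
import heapq

def edf_ls_pick(jobs, recent_service, share_cap=0.5):
    """jobs: list of (deadline, app, work). Pick the earliest deadline, but skip
    any app that already used more than `share_cap` of recent service time."""
    total = sum(recent_service.values()) or 1.0
    heap = list(jobs)
    heapq.heapify(heap)
    while heap:
        deadline, app, work = heapq.heappop(heap)
        if recent_service.get(app, 0.0) / total <= share_cap:
            return (deadline, app, work)
    return min(jobs)  # every app is over its cap: fall back to plain EDF

jobs = [(120, "ocr", 30), (80, "speech", 25), (95, "face", 20)]
print(edf_ls_pick(jobs, {"speech": 70.0, "ocr": 20.0, "face": 10.0}))
# "speech" has the earliest deadline but exceeds its share, so "face" runs first.
```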