skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Storm-RTS: Stream Processing with Stable Performance for Multi-Cloud and Cloud-edge
Stream Processing Engines (SPEs) traditionally de-ploy applications on a set of shared workers (e.g., threads, processes, or containers) requiring complex performance man-agement by SPEs and application developers. We explore a new approach that replaces workers with Rate-based Abstract Ma-chines (RBAMs). This allows SPEs to translate stream operations into FaaS invocations, and exploit guaranteed invocation rates to manage performance. This approach enables SPE applications to achieve transparent and predictable performance. We realize the approach in the Storm-RTS system. Exploring 36 stream processing scenarios over 5 different hardware config-urations, we demonstrate several key advantages. First, Storm-RTS provides stable application performance and can enable flexible reconfiguration across cloud resource configurations. Sec-ond, SPEs built on RBAM can be resource-efficient and scalable. Finally, Storm-RTS allows the stream-processing paradigm to be extended from the cloud to the edge, using its performance stability to hide edge heterogeneity and resource competition. An experiment with 4 cloud and edge sites over 300 cores shows how Storm-RTS can support flexible reconfiguration and simple high-level declarative policies that optimize resource cost or other criteria.  more » « less
Award ID(s):
1901466
PAR ID:
10495146
Author(s) / Creator(s):
;
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-0481-7
Page Range / eLocation ID:
45 to 57
Format(s):
Medium: X
Location:
Chicago, IL, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. The paradigm shift of deploying applications to the cloud has introduced both opportunities and challenges. Although clouds use elasticity to scale resource usage at runtime to help meet an application’s performance requirements, developers are still challenged by unpredictable performance, little control of execution environment, and differences among cloud service providers, all while being charged for their cloud usages. Application performance stability is particularly affected by multi-tenancy in which the hardware is shared among varying applications and virtual machines. Developers porting their applications need to meet performance requirements, but testing on the cloud under the effects of performance uncertainty is difficult and expensive, due to high cloud usage costs. This paper presents a first approach to testing an application with typical inputs for how its performance will be affected by performance uncertainty, without incurring undue costs of bruteforce testing in the cloud. We specify cloud uncertainty testing criteria, design a test-based strategy to characterize the blackbox cloud’s performance distributions using these testing criteria, and support execution of tests to characterize the resource usage and cloud baseline performance of the application to be deployed. Importantly, we developed a smart test oracle that estimates the application’s performance with certain confidence levels using the above characterization test results and determines whether it will meet its performance requirements. We evaluated our testing approach on both the Chameleon cloud and Amazon web services; results indicate that this testing strategy shows promise as a cost-effective approach to test for performance effects of cloud uncertainty when porting an application to the cloud. 
    more » « less
  2. In this paper, we present design, implementation and evaluation of a novel predictive control framework to enable reliable distributed stream data processing, which features a Deep Recurrent Neural Network (DRNN) model for performance prediction, and dynamic grouping for flexible control. Specifically, we present a novel DRNN model, which makes accurate performance prediction with careful consideration for interference of co-located worker processes, according to multilevel runtime statistics. Moreover, we design a new grouping method, dynamic grouping, which can distribute/re-distribute data tuples to downstream tasks according to any given split ratio on the fly. So it can be used to re-direct data tuples to bypass misbehaving workers. We implemented the proposed framework based on a widely used Distributed Stream Data Processing System (DSDPS), Storm. For validation and performance evaluation, we developed two representative stream data processing applications: Windowed URL Count and Continuous Queries. Extensive experimental results show: 1) The proposed DRNN model outperforms widely used baseline solutions, ARIMA and SVR, in terms of prediction accuracy; 2) dynamic grouping works as expected; and 3) the proposed framework enhances reliability by offering minor performance degradation with misbehaving workers. 
    more » « less
  3. null (Ed.)
    Many Internet of Things (IoT) applications are time-critical and dynamically changing. However, traditional data processing systems (e.g., stream processing systems, cloud-based IoT data processing systems, wide-area data analytics systems) are not well-suited for these IoT applications. These systems often do not scale well with a large number of concurrently running IoT applications, do not support low-latency processing under limited computing resources, and do not adapt to the level of heterogeneity and dynamicity commonly present at edge environments. This suggests a need for a new edge stream processing system that advances the stream processing paradigm to achieve efficiency and flexibility under the constraints presented by edge computing architectures. We present \textsc{Dart}, a scalable and adaptive edge stream processing engine that enables fast processing of a large number of concurrent running IoT applications’ queries in dynamic edge environments. The novelty of our work is the introduction of a dynamic dataflow abstraction by leveraging distributed hash table (DHT) based peer-to-peer (P2P) overlay networks, which can automatically place, chain, and scale stream operators to reduce query latency, adapt to edge dynamics, and recover from failures. We show analytically and empirically that DART outperforms Storm and EdgeWise on query latency and significantly improves scalability and adaptability when processing a large number of real-world IoT stream applications' queries. DART significantly reduces application deployment setup times, becoming the first streaming engine to support DevOps for IoT applications on edge platforms. 
    more » « less
  4. IoT (Internet of Things) devices such as sensors have been actively used in 'fogs' to provide critical data during e.g., disaster response scenarios or in-home healthcare. Since IoT devices typically operate in resource-constrained computing environments at the network-edge, data transfer performance to the cloud as well as end-to-end security have to be robust and customizable. In this paper, we present the design and implementation of a middleware featuring "intermittent" and "flexible" end-to-end security for cloud-fog communications. Intermittent security copes with unreliable network connections, and flexibility is achieved through security configurations that are tailored to application needs. Our experiment results show how our middleware that leverages static pre-shared keys forms a promising solution for delivering light-weight, fast and resource-aware security for a variety of IoT-based applications. 
    more » « less
  5. Emerging distributed cloud architectures, e.g., fog and mobile edge computing, are playing an increasingly impor-tant role in the efficient delivery of real-time stream-processing applications (also referred to as augmented information services), such as industrial automation and metaverse experiences (e.g., extended reality, immersive gaming). While such applications require processed streams to be shared and simultaneously consumed by multiple users/devices, existing technologies lack efficient mechanisms to deal with their inherent multicast na-ture, leading to unnecessary traffic redundancy and network congestion. In this paper, we establish a unified framework for distributed cloud network control with generalized (mixed-cast) traffic flows that allows optimizing the distributed execution of the required packet processing, forwarding, and replication operations. We first characterize the enlarged multicast network stability region under the new control framework (with respect to its unicast counterpart). We then design a novel queuing system that allows scheduling data packets according to their current destination sets, and leverage Lyapunov drift-plus-penalty con-trol theory to develop the first fully decentralized, throughput-and cost-optimal algorithm for multicast flow control. Numerical experiments validate analytical results and demonstrate the performance gain of the proposed design over existing network control policies. 
    more » « less