skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: High-throughput Real-time Edge Stream Processing with Topology-Aware Resource Matching
With the proliferation of Internet of Things (IoT) devices, real-time stream processing at the edge of the network has gained significant attention. However, edge stream processing systems face substantial challenges due to the heterogeneity and constraints of computational and network resources and the intricacies of multi-tenant application hosting. An optimized placement strategy for edge application topology becomes crucial to leverage the advantages offered by Edge computing and enhance the throughput and end-to-end latency of data streams. This paper presents Beaver, a resource scheduling framework designed to efficiently deploy stream processing topologies across distributed edge nodes. Its core is a novel scheduler that employs a synergistic integration of graph partitioning within application topologies and a two-sided matching technique to optimize the strategic placement of stream operators. Beaver aims to achieve optimal performance by minimizing bottlenecks in the network, memory, and CPU resources at the edge. We implemented a prototype of Beaver using Apache Storm and Kubernetes orchestration engine and evaluated its performance using an open-source real-time IoT benchmark (RIoTBench). Compared to state-of-the-art techniques, experimental evaluations demonstrate at least 1.6× improvement in the number of tuples processed within a one-second deadline under varying network delay and bandwidth scenarios.  more » « less
Award ID(s):
2135439
PAR ID:
10530947
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
IEEE
Date Published:
Format(s):
Medium: X
Location:
Philadelphia, PA, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Next-generation stream processing systems for community scale IoT applications must handle complex nonfunctional needs, e.g. scalability of input, reliability/timeliness of communication and privacy/security of captured data. In many IoT settings, efficiently batching complex workflows remains challenging in resource-constrained environments. High data rates, combined with real-time processing needs for applications, have pointed to the need for efficient edge stream processing techniques. In this work, we focus on designing scalable edge stream processing workflows in real-world IoT deployments where performance and privacy are key concerns. Initial efforts have revealed that privacy policy execution/enforcement at the edge for intensive workloads is prohibitively expensive. Thus, we leverage intelligent batching techniques to enhance the performance and throughput of streaming in IoT smart spaces. We introduce BatchIT, a processing middleware based on a smart batching strategy that optimizes the trade-off between batching delay and the end-to-end delay requirements of IoT applications. Through experiments with a deployed system we demonstrate that BatchIT outperforms several approaches, including micro-batching and EdgeWise, while reducing computation overhead. 
    more » « less
  2. null (Ed.)
    Many Internet of Things (IoT) applications are time-critical and dynamically changing. However, traditional data processing systems (e.g., stream processing systems, cloud-based IoT data processing systems, wide-area data analytics systems) are not well-suited for these IoT applications. These systems often do not scale well with a large number of concurrently running IoT applications, do not support low-latency processing under limited computing resources, and do not adapt to the level of heterogeneity and dynamicity commonly present at edge environments. This suggests a need for a new edge stream processing system that advances the stream processing paradigm to achieve efficiency and flexibility under the constraints presented by edge computing architectures. We present \textsc{Dart}, a scalable and adaptive edge stream processing engine that enables fast processing of a large number of concurrent running IoT applications’ queries in dynamic edge environments. The novelty of our work is the introduction of a dynamic dataflow abstraction by leveraging distributed hash table (DHT) based peer-to-peer (P2P) overlay networks, which can automatically place, chain, and scale stream operators to reduce query latency, adapt to edge dynamics, and recover from failures. We show analytically and empirically that DART outperforms Storm and EdgeWise on query latency and significantly improves scalability and adaptability when processing a large number of real-world IoT stream applications' queries. DART significantly reduces application deployment setup times, becoming the first streaming engine to support DevOps for IoT applications on edge platforms. 
    more » « less
  3. Virtual Reality (VR)-based Learning Environments (VRLEs) are gaining popularity due to the wide availability of cloud and its edge (a.k.a. fog) technologies and high-speed networks. Thus, there is a need to investigate Internet-of-Things (IoT)-based application design concepts within social VRLEs to offer scalable, cost-efficient services that adapt to dynamic cloud/fog system conditions. In this paper, we investigate the costperformance trade-offs for an IoT-based application that integrates large-scale sensor data from Social VRLEs and coordinates the real-time data processing and visualization across cloud/fog platforms. To facilitate dynamic performance adaptation of the IoT-based application with increased user scale, we present a set of cost-aware adaptive control rules. The implementation of the rules is based on an analytical queuing model that determines the performance states of the IoT-based application, given the current workload and the allocated cloud/fog resources. Using the IoTbased application in an exemplar VRLE use case, we evaluate the cost-performance trade-offs with three system architectures i.e., cloud-only, edge-only and edge-cloud architectures. Experiment results illustrate the best/worst practices in the cost-performance trade-offs for a range of simulated IoT scenarios involving monitoring user emotional data collected by using brain sensors. Our results also detail the impact of the system architecture selection, and the benefits in enabling feedback about student emotions to instructors during Social VR learning sessions. Lastly, we show the benefits of integrating our model-based feedback control in maximizing IoT-based application performance while keeping the associated costs at a minimum level. 
    more » « less
  4. Virtual Reality (VR)-based Learning Environments (VRLEs) are gaining popularity due to the wide availability of cloud and its edge (a.k.a. fog) technologies and high-speed networks. Thus, there is a need to investigate Internet-of-Things (IoT)-based application design concepts within social VRLEs to offer scalable, cost-efficient services that adapt to dynamic cloud/fog system conditions. In this paper, we investigate the costperformance trade-offs for an IoT-based application that integrates large-scale sensor data from Social VRLEs and coordinates the real-time data processing and visualization across cloud/fog platforms. To facilitate dynamic performance adaptation of the IoT-based application with increased user scale, we present a set of cost-aware adaptive control rules. The implementation of the rules is based on an analytical queuing model that determines the performance states of the IoT-based application, given the current workload and the allocated cloud/fog resources. Using the IoTbased application in an exemplar VRLE use case, we evaluate the cost-performance trade-offs with three system architectures i.e., cloud-only, edge-only and edge-cloud architectures. Experiment results illustrate the best/worst practices in the cost-performance trade-offs for a range of simulated IoT scenarios involving monitoring user emotional data collected by using brain sensors. Our results also detail the impact of the system architecture selection, and the benefits in enabling feedback about student emotions to instructors during Social VR learning sessions. Lastly, we show the benefits of integrating our model-based feedback control in maximizing IoT-based application performance while keeping the associated costs at a minimum level. 
    more » « less
  5. We introduce Canal, a programmable, topic-based, publish/subscribe system that is designed for multi-tier cloud deployments (e.g. edge-cloud, multi-cloud, IoT-cloud, etc.). Canal implements a triggered computational (i.e. “serverless”) programming model and provides developers with a uniform and portable programming interface. To achieve scalability and reliability, Canal combines the use of a distributed hash table (DHT) and replica consensus protocol to distribute and replicate functions, state, and data. Canal also decouples replica placement from the DHT topology to allow developers to optimize function placement for different objectives. We evaluate Canal using a real-world multi-tier IoT deployment and we use Canal to compare placement strategies, end-to-end performance, and failure recovery using both benchmarks and a real-world IoT-edge application. Our results show that Canal is able to achieve both low latency and reliability in this setting. 
    more » « less