Title: CHARIOT: Goal-driven Orchestration Middleware for Resilient IoT Systems
An emerging trend in Internet of Things (IoT) applications is to move the computation (cyber) closer to the source of the data (physical). This paradigm is often referred to as edge computing. If edge resources are pooled together, they can be used as decentralized shared resources for IoT applications, providing increased capacity to scale up computations and minimize end-to-end latency. Managing applications on these edge resources is hard, however, due to their remote, distributed, and (possibly) dynamic nature, which necessitates autonomous management mechanisms that facilitate application deployment, failure avoidance, failure management, and incremental updates. To address these needs, we present CHARIOT, an orchestration middleware capable of autonomously managing IoT systems consisting of edge resources and applications. CHARIOT implements a three-layer architecture: the topmost layer comprises a system description language; the middle layer comprises a persistent data storage layer and the corresponding schema for storing system information; and the bottom layer comprises a management engine that uses the persistently stored information to formulate constraints encoding system properties and requirements, thereby enabling the use of Satisfiability Modulo Theories (SMT) solvers to compute optimal system (re)configurations dynamically at runtime. This paper describes the structure and functionality of CHARIOT and evaluates its efficacy through a smart parking system case study that uses sensors to manage parking spaces.
Award ID(s): 1528799
NSF-PAR ID: 10054148
Journal Name: ACM Transactions on Cyber-Physical Systems
Format(s): Medium: X
Sponsoring Org: National Science Foundation
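
The abstract above mentions formulating system properties and requirements as constraints for an SMT solver. As a rough, hypothetical illustration of that idea (not CHARIOT's actual encoding), the sketch below uses the Z3 Python bindings to place application components onto edge nodes under capacity and failure constraints; all node and component names, capacities, and the objective are invented for the example.

```python
# Hypothetical sketch of SMT-based (re)configuration in the spirit of
# CHARIOT: assign application components to edge nodes subject to capacity
# and failure constraints. Names and capacities are invented examples.
from z3 import Bool, If, Sum, Optimize, sat, is_true

nodes = {"edge-1": 2, "edge-2": 2, "edge-3": 1}   # node -> capacity (slots)
components = ["sensor-agg", "occupancy-detector", "dashboard"]

# place[c][n] is true iff component c is assigned to node n.
place = {c: {n: Bool(f"place_{c}_{n}") for n in nodes} for c in components}

opt = Optimize()

# Each component must run on exactly one node.
for c in components:
    opt.add(Sum([If(place[c][n], 1, 0) for n in nodes]) == 1)

# No node may exceed its capacity.
for n, cap in nodes.items():
    opt.add(Sum([If(place[c][n], 1, 0) for c in components]) <= cap)

# Simulate a detected failure: edge-3 is down, so nothing is placed there.
for c in components:
    opt.add(place[c]["edge-3"] == False)

# A stand-in objective: prefer configurations that use few nodes.
used = Sum([If(Sum([If(place[c][n], 1, 0) for c in components]) > 0, 1, 0)
            for n in nodes])
opt.minimize(used)

if opt.check() == sat:
    model = opt.model()
    for c in components:
        for n in nodes:
            if is_true(model.eval(place[c][n], model_completion=True)):
                print(f"{c} -> {n}")
```

In CHARIOT the constraints are generated from the persistently stored system description rather than written by hand, but the solving step is analogous: when a failure is detected, the failed node's constraints change and the solver recomputes a valid placement.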
More Like this
  1. As scaling of conventional memory devices has stalled, many high-end computing systems have begun to incorporate alternative memory technologies to meet performance goals. Since these technologies present distinct advantages and tradeoffs compared to conventional DDR* SDRAM, such as higher bandwidth with lower capacity or vice versa, they are typically packaged alongside conventional SDRAM in a heterogeneous memory architecture. To utilize the different types of memory efficiently, new data management strategies are needed to match application usage to the best available memory technology. However, current proposals for managing heterogeneous memories are limited because they either (1) do not consider high-level application behavior when assigning data to different types of memory or (2) require separate program execution (with a representative input) to collect information about how the application uses memory resources. This work presents a new data management toolset to address the limitations of existing approaches for managing complex memories. It extends the application runtime layer with automated monitoring and management routines that assign application data to the best tier of memory based on previous usage, without any need for source code modification or a separate profiling run. It evaluates this approach on a state-of-the-art server platform with both conventional DDR4 SDRAM and non-volatile Intel Optane DC memory, using both memory-intensive high-performance computing (HPC) applications and standard benchmarks. Overall, the results show that this approach improves program performance significantly compared to a standard unguided approach across a variety of workloads and system configurations. The HPC applications exhibit the largest benefits, with speedups ranging from 1.4× to 7× in the best cases. Additionally, we show that this approach achieves performance similar to a comparable offline profiling-based approach after a short startup period, without requiring separate program execution or offline analysis steps.
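
The toolset itself is not reproduced here; as a loose, hypothetical illustration of usage-guided tier assignment, the sketch below ranks allocation sites by observed accesses per byte and fills the fast DDR4 tier first. All site names, sizes, and access counts are invented.

```python
# Illustrative sketch of usage-guided tier assignment: hot allocation sites
# go to fast-but-small DDR4, the rest to large-but-slower Optane DC.
# Site names, sizes, and access counts are hypothetical.

DDR4_CAPACITY = 4 * 1024**3  # 4 GiB of fast-tier budget (example value)

# (allocation site, bytes allocated, observed access count)
sites = [
    ("matrix_buffer", 3 * 1024**3, 9_500_000),
    ("halo_cells",    1 * 1024**3, 4_200_000),
    ("checkpoint",    6 * 1024**3,    12_000),
]

def assign_tiers(sites, fast_budget):
    """Greedy knapsack-style policy: rank sites by accesses per byte and
    fill the fast tier until its capacity budget is exhausted."""
    ranked = sorted(sites, key=lambda s: s[2] / s[1], reverse=True)
    placement, used = {}, 0
    for name, size, _ in ranked:
        if used + size <= fast_budget:
            placement[name] = "DDR4"
            used += size
        else:
            placement[name] = "OptaneDC"
    return placement

print(assign_tiers(sites, DDR4_CAPACITY))
```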
  2. In the era of big data, materials science workflows need to handle large-scale data distribution, storage, and computation, any of which can become a performance bottleneck. We present a framework for analyzing internal material structures (e.g., cracks) to mitigate these bottlenecks. We demonstrate the effectiveness of our framework for a workflow performing synchrotron X-ray computed tomography reconstruction and segmentation of a silica-based structure. Our framework provides a cloud-based, cutting-edge solution to challenges such as growing intermediate and output data and heavy resource demands during image reconstruction and segmentation. Specifically, our framework efficiently manages data storage and scales up compute resources on the cloud. The framework's multi-layer software structure includes three layers: a top layer that uses Jupyter notebooks and serves as the user interface; a middle layer that uses Ansible for resource deployment and for managing the execution environment; and a low layer dedicated to resource management and job scheduling on heterogeneous nodes (i.e., GPU and CPU). At the core of this layer, Kubernetes supports resource management and Dask enables large-scale job scheduling for heterogeneous resources. The broader impact of our work is four-fold: through our framework, we hide the complexity of the cloud's software stack from the user, who would otherwise need expertise in cloud technologies; we manage job scheduling efficiently and in a scalable manner; we enable resource elasticity and workflow orchestration at a large scale; and we facilitate moving the study of nonporous structures, which has wide applications in engineering and scientific fields, to the cloud. While we demonstrate the capability of our framework for a specific materials science application, it can be adapted to other applications and domains because of its modular, multi-layer architecture.
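
The framework's code is not shown in the abstract; as a minimal sketch of the low layer's idea of scheduling across heterogeneous nodes, the example below uses Dask's worker resource tags to route a GPU-heavy reconstruction step and a CPU-bound segmentation step to appropriately tagged workers. The scheduler address, worker setup, and function bodies are placeholder assumptions, not the framework's actual API.

```python
# Minimal sketch of Dask-based scheduling on heterogeneous nodes. Assumes a
# Dask cluster whose workers were started with resource tags, e.g.:
#   dask worker tcp://scheduler:8786 --resources "GPU=1"
#   dask worker tcp://scheduler:8786 --resources "CPU=1"
# Function names and the scheduler address are placeholders.
from dask.distributed import Client

def reconstruct_slice(raw):   # stand-in for GPU tomographic reconstruction
    return raw

def segment_slice(volume):    # stand-in for CPU-bound segmentation
    return volume

client = Client("tcp://scheduler:8786")  # hypothetical scheduler address

# Route reconstruction to GPU-tagged workers, segmentation to CPU workers.
recon = client.submit(reconstruct_slice, "raw-slice-000", resources={"GPU": 1})
seg = client.submit(segment_slice, recon, resources={"CPU": 1})
print(seg.result())
```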
  3. The Internet of Things (IoT) is a vast collection of interconnected sensors, devices, and services that share data and information over the Internet, with the objective of leveraging multiple information sources to optimize related systems. The technologies associated with the IoT have significantly improved the quality of many existing applications by reducing costs, improving functionality, increasing access to resources, and enhancing automation. The adoption of IoT by industries has led to the next industrial revolution: Industry 4.0. The rise of the Industrial IoT (IIoT) promises to enhance factory management, process optimization, worker safety, and more. However, the rollout of the IIoT is not without significant issues, many of which act as major barriers to fully achieving the vision of Industry 4.0. One major area of concern is the security and privacy of the massive datasets that are captured and stored, which may leak information about intellectual property, trade secrets, and other competitive knowledge. As a way forward toward solving these security and privacy concerns, in this paper we aim to identify common input-output (I/O) design patterns that exist in IIoT applications. These design patterns enable constructing an abstract model representation of the data-flow semantics used by such applications, and thereby a better understanding of how to secure the information related to IIoT operations. In this paper, we describe communication protocols and identify common I/O design patterns for IIoT applications, with an emphasis on data flow in edge devices, which, in the industrial control system (ICS) setting, are most often involved in process control or monitoring.
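
The paper's pattern catalog is not reproduced here; purely as a hypothetical illustration of one common edge I/O pattern it alludes to (periodic monitoring with upstream publication), the sketch below samples a process variable and publishes it over MQTT. The broker address, topic, and sensor read are invented placeholders.

```python
# Sketch of a common IIoT edge I/O pattern: periodically sample a process
# variable and publish it upstream over MQTT. Broker, topic, and the sensor
# read are placeholders, not taken from the paper.
import json, random, time
import paho.mqtt.client as mqtt

BROKER = "broker.plant.example"    # hypothetical broker host
TOPIC = "plant/line1/temperature"  # hypothetical topic

def read_sensor():
    # Stand-in for an actual ADC or fieldbus read.
    return 20.0 + random.random()

client = mqtt.Client()  # paho-mqtt 1.x constructor; 2.x also takes a callback API version
client.connect(BROKER, 1883)
client.loop_start()

for _ in range(10):  # sample-and-publish loop (monitoring pattern)
    reading = {"ts": time.time(), "value": read_sensor()}
    client.publish(TOPIC, json.dumps(reading), qos=1)
    time.sleep(1.0)

client.loop_stop()
client.disconnect()
```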
  4. Edge Cloud (EC) is poised to support massive machine-type communication (mMTC) for 5G and IoT by providing compute and network resources at the edge. Yet the EC, being regional and smaller in scale, faces challenges in bandwidth and computational throughput. Resource management techniques are necessary to achieve efficient resource allocation objectives. Software Defined Network (SDN)-enabled EC architecture is emerging as a potential solution that enables dynamic bandwidth allocation and task scheduling for latency-sensitive and diverse mobile applications in the EC environment. This study proposes a novel Heuristic Reinforcement Learning (HRL) based flow-level dynamic bandwidth allocation framework and validates it through an end-to-end implementation using the OpenFlow meter feature. OpenFlow meters provide granular control and allow demand-based flow management to meet the diverse QoS requirements of IoT traffic. The proposed framework is then evaluated by emulating an EC scenario based on the real NSF COSMOS testbed topology at The City College of New York. A specific heuristic reinforcement learning approach with a linear-annealing technique and a pruning principle are proposed and compared with the baseline approach. Our proposed strategy performs consistently in both Mininet-based and hardware OpenFlow switch environments. The performance evaluation considers key metrics associated with real-time applications: throughput, end-to-end delay, packet loss rate, and overall system cost for bandwidth allocation. Furthermore, our proposed linear-annealing method achieves a faster convergence rate and better reward in terms of system cost, and the proposed pruning principle remarkably reduces control traffic in the network.
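
The paper's HRL agent and OpenFlow integration are not shown in the abstract; the toy sketch below only illustrates the linear-annealing idea: an epsilon-greedy Q-learning policy over candidate meter rates whose exploration rate decays linearly per step. The states, actions, and reward are invented stand-ins, not the paper's model.

```python
# Toy sketch of epsilon-greedy Q-learning with linear annealing for
# bandwidth allocation. All states, actions, and rewards are hypothetical.
import random
from collections import defaultdict

ACTIONS = [1, 2, 5, 10]  # candidate meter rates in Mbps (example values)
EPS_START, EPS_END, ANNEAL_STEPS = 1.0, 0.05, 5000
ALPHA, GAMMA = 0.1, 0.9

Q = defaultdict(float)   # Q[(state, action)] -> estimated value

def epsilon(step):
    """Linearly anneal exploration from EPS_START down to EPS_END."""
    frac = min(step / ANNEAL_STEPS, 1.0)
    return EPS_START + frac * (EPS_END - EPS_START)

def choose_action(state, step):
    if random.random() < epsilon(step):
        return random.choice(ACTIONS)                     # explore
    return max(ACTIONS, key=lambda a: Q[(state, a)])      # exploit

def update(state, action, reward, next_state):
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next
                                   - Q[(state, action)])

# Toy training loop: reward penalizes deviation from a 5 Mbps target rate.
state = "congested"
for step in range(ANNEAL_STEPS):
    a = choose_action(state, step)
    update(state, a, -abs(a - 5), state)

print({a: round(Q[(state, a)], 2) for a in ACTIONS})
```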
  5. There is increasing interest in deploying building-scale, general-purpose, and high-fidelity sensing to drive emerging smart building applications. However, the real-world deployment of such systems is challenging due to the lack of system and architectural support. Most existing sensing systems are purpose-built, consisting of hardware that senses a limited set of environmental facets, typically at low fidelity and for short-term deployment. Furthermore, prior systems with high-fidelity sensing and machine learning fail to scale effectively and offer few primitives, if any, for privacy and security. For these reasons, IoT deployments in buildings are generally short-lived or done as a proof of concept. We present the design of Mites, a scalable end-to-end hardware-software system for supporting and managing distributed general-purpose sensors in buildings. Our design includes robust primitives for privacy and security, essential features for scalable data management, and machine learning to support diverse applications in buildings. We deployed our Mites system and 314 Mites devices in Tata Consultancy Services (TCS) Hall at Carnegie Mellon University (CMU), a fully occupied, five-story university building. We present comprehensive evaluations of our system, using a series of microbenchmarks and end-to-end tests, to show how we achieved our stated design goals. We include five proof-of-concept applications to demonstrate the extensibility of the Mites system to support compelling IoT applications. Finally, we discuss the real-world challenges we faced and the lessons we learned over the five-year journey of our stack's iterative design, development, and deployment.