skip to main content


This content will become publicly available on July 1, 2024

Title: INDIANA—In-Network Distributed Infrastructure for Advanced Network Applications

Data volumes are exploding as sensors proliferate and become more capable. Edge computing is envisioned as a path to distribute processing and reduce latency. Many models of Edge computing consider small devices running conventional software. Our model includes a more lightweight execution engine for network microservices and a network scheduling framework to configure network processing elements to process streams and direct the appropriate traffic to them. In this article, we describe INDIANA, a complete framework for in-network microservices. We will describe how the two components-the INDIANA network Processing Element (InPE) and the Flange Network Operating System (NOS)-work together to achieve effective in-network processing to improve performance in edge to cloud environments. Our processing elements provide lightweight compute units optimized for efficient stream processing. These elements are customizable and vary in sophistication and resource consumption. The Flange NOS provides first-class flow based reasoning to drive function placement, network configuration, and load balancing that can respond dynamically to network conditions. We describe design considerations and discuss our approach and implementations. We evaluate the performance of stream processing and examine the performance of several exemplar applications on networks of increasing scale and complexity.

 
more » « less
Award ID(s):
2126266
NSF-PAR ID:
10473346
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
The International Journal of High Performance Computing Applications
Date Published:
Journal Name:
The International Journal of High Performance Computing Applications
Volume:
37
Issue:
3-4
ISSN:
1094-3420
Page Range / eLocation ID:
442 to 461
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Serverless computing promises an efficient, low-cost compute capability in cloud environments. However, existing solutions, epitomized by open-source platforms such as Knative, include heavyweight components that undermine this goal of serverless computing. Additionally, such serverless platforms lack dataplane optimizations to achieve efficient, high-performance function chains that facilitate the popular microservices development paradigm. Their use of unnecessarily complex and duplicate capabilities for building function chains severely degrades performance. 'Cold-start' latency is another deterrent. We describe SPRIGHT, a lightweight, high-performance, responsive serverless framework. SPRIGHT exploits shared memory processing and dramatically improves the scalability of the dataplane by avoiding unnecessary protocol processing and serialization-deserialization overheads. SPRIGHT extensively leverages event-driven processing with the extended Berkeley Packet Filter (eBPF). We creatively use eBPF's socket message mechanism to support shared memory processing, with overheads being strictly load-proportional. Compared to constantly-running, polling-based DPDK, SPRIGHT achieves the same dataplane performance with 10× less CPU usage under realistic workloads. Additionally, eBPF benefits SPRIGHT, by replacing heavyweight serverless components, allowing us to keep functions 'warm' with negligible penalty. Our preliminary experimental results show that SPRIGHT achieves an order of magnitude improvement in throughput and latency compared to Knative, while substantially reducing CPU usage, and obviates the need for 'cold-start'. 
    more » « less
  2. Emerging distributed cloud architectures, e.g., fog and mobile edge computing, are playing an increasingly impor-tant role in the efficient delivery of real-time stream-processing applications (also referred to as augmented information services), such as industrial automation and metaverse experiences (e.g., extended reality, immersive gaming). While such applications require processed streams to be shared and simultaneously consumed by multiple users/devices, existing technologies lack efficient mechanisms to deal with their inherent multicast na-ture, leading to unnecessary traffic redundancy and network congestion. In this paper, we establish a unified framework for distributed cloud network control with generalized (mixed-cast) traffic flows that allows optimizing the distributed execution of the required packet processing, forwarding, and replication operations. We first characterize the enlarged multicast network stability region under the new control framework (with respect to its unicast counterpart). We then design a novel queuing system that allows scheduling data packets according to their current destination sets, and leverage Lyapunov drift-plus-penalty con-trol theory to develop the first fully decentralized, throughput-and cost-optimal algorithm for multicast flow control. Numerical experiments validate analytical results and demonstrate the performance gain of the proposed design over existing network control policies. 
    more » « less
  3. Recent breakthroughs in deep learning (DL) have led to the emergence of many intelligent mobile applications and services, but in the meanwhile also pose unprecedented computing challenges on resource-constrained mobile devices. This paper builds a collaborative deep inference system between a resource-constrained mobile device and a powerful edge server, aiming at joining the power of both on-device processing and computation offloading. The basic idea of this system is to partition a deep neural network (DNN) into a front-end part running on the mobile device and a back-end part running on the edge server, with the key challenge being how to locate the optimal partition point to minimize the end-to-end inference delay. Unlike existing efforts on DNN partitioning that rely heavily on a dedicated offline profiling stage to search for the optimal partition point, our system has a built-in online learning module, called Autodidactic Neurosurgeon (ANS), to automatically learn the optimal partition point on-the-fly. Therefore, ANS is able to closely follow the changes of the system environment by generating new knowledge for adaptive decision making. The core of ANS is a novel contextual bandit learning algorithm, called μLinUCB, which not only has provable theoretical learning performance guarantee but also is ultra-lightweight for easy real-world implementation. We implement our system on a video stream object detection testbed to validate the design of ANS and evaluate its performance. The experiments show that ANS significantly outperforms state-of-the-art benchmarks in terms of tracking system changes and reducing the end-to-end inference delay. 
    more » « less
  4. Blockchain technology has been recognized as a promising solution to enhance the security and privacy of Internet of Things (IoT) and Edge Computing scenarios. Taking advantage of the Proof-of-Work (PoW) consensus protocol, which solves a computation intensive hashing puzzle, Blockchain ensures the security of the system by establishing a digital ledger. However, the computation intensive PoW favors members possessing more computing power. In the IoT paradigm, fairness in the highly heterogeneous network edge environments must consider devices with various constraints on computation power. Inspired by the advanced features of Digital Twins (DT), an emerging concept that mirrors the lifespan and operational characteristics of physical objects, we propose a novel Miner Twins (MinT) architecture to enable a fair PoW consensus mechanism for blockchains in IoT environments. MinT adopts an edge-fog-cloud hierarchy. All physical miners of the blockchain are deployed as microservices on distributed edge devices, while fog/cloud servers maintain digital twins that periodically update miners’ running status. By timely monitoring of a miner’s footprint that is mirrored by twins, a lightweight Singular Spectrum Analysis (SSA)-based detection achieves the identification of individual misbehaved miners that violate fair mining. Moreover, we also design a novel Proof-of-Behavior (PoB) consensus algorithm to detect dishonest miners that collude to control a fair mining network. A preliminary study is conducted on a proof-of-concept prototype implementation, and experimental evaluation shows the feasibility and effectiveness of the proposed MinT scheme under a distributed byzantine network environment. 
    more » « less
  5. As the machine learning and systems communities strive to achieve higher energy-efficiency through custom deep neural network (DNN) accelerators, varied precision or quantization levels, and model compression techniques, there is a need for design space exploration frameworks that incorporate quantization-aware processing elements into the accelerator design space while having accurate and fast power, performance, and area models. In this work, we present QUIDAM , a highly parameterized quantization-aware DNN accelerator and model co-exploration framework. Our framework can facilitate future research on design space exploration of DNN accelerators for various design choices such as bit precision, processing element type, scratchpad sizes of processing elements, global buffer size, number of total processing elements, and DNN configurations. Our results show that different bit precisions and processing element types lead to significant differences in terms of performance per area and energy. Specifically, our framework identifies a wide range of design points where performance per area and energy varies more than 5 × and 35 ×, respectively. With the proposed framework, we show that lightweight processing elements achieve on par accuracy results and up to 5.7 × more performance per area and energy improvement when compared to the best INT16 based implementation. Finally, due to the efficiency of the pre-characterized power, performance, and area models, QUIDAM can speed up the design exploration process by 3-4 orders of magnitude as it removes the need for expensive synthesis and characterization of each design. 
    more » « less