Title: Keep Clear of the Edges: An Empirical Study of Artificial Intelligence Workload Performance and Resource Footprint on Edge Devices
Recently, with the advent of the Internet of Everything and 5G networks, the amount of data generated by edge scenarios such as autonomous vehicles, smart industry, 4K/8K, virtual reality (VR), and augmented reality (AR) has exploded. These trends have brought real-time, hardware-dependence, low-power, and security requirements to edge facilities and have rapidly popularized edge computing. Meanwhile, artificial intelligence (AI) workloads have dramatically changed the computing paradigm, from cloud services to mobile applications. Unlike AI in the cloud or on mobile platforms, which is widely deployed and well studied, AI workload performance and resource impact at the edge are not yet well understood. An in-depth analysis and comparison of their advantages, limitations, performance, and resource consumption in an edge environment is lacking. In this paper, we perform a comprehensive study of representative AI workloads on edge platforms. We first survey modern edge hardware and popular AI workloads. We then quantitatively evaluate three categories (classification, image-to-image, and segmentation) of the most popular and widely used AI applications in realistic edge environments based on Raspberry Pi, Nvidia TX2, etc. We find that the interaction between hardware and neural network models incurs non-negligible impact and overhead on AI workloads at the edge. Our experiments show that performance variation and differences in resource footprint limit the availability of certain types of workloads and their algorithms on edge platforms, and that users need to select an appropriate workload, model, and algorithm based on the requirements and characteristics of the edge environment.
Award ID(s): 2103405, 2103459
PAR ID: 10379453
Journal Name: 2022 IEEE International Performance, Computing, and Communications Conference (IPCCC)
Page Range / eLocation ID: 7 to 16
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Edge computing has emerged as a popular paradigm for supporting mobile and IoT applications with low latency or high bandwidth needs. The attractiveness of edge computing has been further enhanced by the recent availability of special-purpose hardware to accelerate specific compute tasks, such as deep learning inference, on edge nodes. In this paper, we experimentally compare the benefits and limitations of specialized edge systems, built using edge accelerators, with those of more traditional forms of edge and cloud computing. Our experimental study using edge-based AI workloads shows that today's edge accelerators can provide comparable, and in many cases better, performance, when normalized for power or cost, than traditional edge and cloud servers. They also provide latency and bandwidth benefits for split processing, across and within tiers, when using model compression or model splitting, but require dynamic methods to determine the optimal split across tiers. We find that edge accelerators can support varying degrees of concurrency for multi-tenant inference applications, but lack the isolation mechanisms necessary for edge cloud multi-tenant hosting.
  2. Since emerging edge applications such as Internet of Things (IoT) analytics and augmented reality have tight latency constraints, hardware AI accelerators have been recently proposed to speed up deep neural network (DNN) inference run by these applications. Resource-constrained edge servers and accelerators tend to be multiplexed across multiple IoT applications, introducing the potential for performance interference between latency-sensitive workloads. In this article, we design analytic models to capture the performance of DNN inference workloads on shared edge accelerators, such as GPU and edgeTPU, under different multiplexing and concurrency behaviors. After validating our models using extensive experiments, we use them to design various cluster resource management algorithms to intelligently manage multiple applications on edge accelerators while respecting their latency constraints. We implement a prototype of our system in Kubernetes and show that our system can host 2.3× more DNN applications in heterogeneous multi-tenant edge clusters with no latency violations when compared to traditional knapsack hosting algorithms. 
  3. This paper articulates our vision for a learning-based untrustworthy distributed database. We focus on permissioned blockchain systems as an emerging instance of untrustworthy distributed databases and argue that as novel smart contracts, modern hardware, and new cloud platforms arise, future-proof permissioned blockchain systems need to be designed with full-stack adaptivity in mind. At the application level, a future-proof system must adaptively learn the best-performing transaction processing paradigm and quickly adapt to new hardware and unanticipated workload changes on the fly. Likewise, the Byzantine consensus layer must dynamically adjust itself to the workloads, faulty conditions, and network configuration while maintaining compatibility with the transaction processing paradigm. At the infrastructure level, cloud providers must enable cross-layer adaptation, which identifies performance bottlenecks and possible attacks, and determines at runtime the degree of resource disaggregation that best meets application requirements. Within this vision of the future, our paper outlines several research challenges together with some preliminary approaches.
  4. Mobile devices supporting the "Internet of Things" (IoT) often have limited computation, battery energy, and storage capabilities, especially for supporting resource-intensive applications involving virtual reality (VR), augmented reality (AR), multimedia delivery, and artificial intelligence (AI), which can require broad bandwidth, low response latency, and large computational power. Edge cloud, or edge computing, is an emerging topic and technology that can address the shortcomings of the current centralized-only cloud computing model by moving computation and storage resources closer to the devices in support of the above-mentioned applications. To make this happen, efficient coordination mechanisms and “offloading” algorithms are needed to allow the mobile devices and the edge cloud to work together smoothly. In this survey paper, we investigate the key issues, methods, and various state-of-the-art efforts related to the offloading problem. We adopt a new characterizing model to study the whole process of offloading from mobile devices to the edge cloud. Through comprehensive discussions, we aim to draw an overall “big picture” of the existing efforts and research directions. Our study also indicates that offloading algorithms in the edge cloud have demonstrated profound potential for future technology and application development.
  5. The integration of onboard computing capabilities with unmanned aerial vehicles (UAV) has gained significant attention in recent years as part of mobile computing paradigms such as mobile edge computing (MEC), fog computing, and mobile cloud computing. To enhance the performance of airborne computing, networked airborne computing (NAC) aims to interconnect UAVs through direct flight-to-flight links, with UAVs sharing resources with each other. However, despite the growing interest in NAC and UAV-based computing, existing studies rely heavily on numerical simulations for performance evaluation and lack realistic simulators and hardware testbeds. To fill this gap, this paper presents the development of two NAC platforms: a realistic simulator based on ROS and Gazebo, and a hardware testbed with multiple UAVs communicating and sharing computing resources. Through simulation and real flight tests with two computation applications, we evaluate the platforms and examine the impact of mobility on NAC performance. Our findings offer valuable insights into NAC and provide guidance for future advancements. 