Title: Progressive Neural Compression for Adaptive Image Offloading Under Timing Constraints
IoT devices are increasingly the source of data for machine learning (ML) applications running on edge servers. Data transmissions from devices to servers are often over local wireless networks whose bandwidth is not just limited but, more importantly, variable. Furthermore, in cyber-physical systems interacting with the physical environment, image offloading is also commonly subject to timing constraints. It is, therefore, important to develop an adaptive approach that maximizes the inference performance of ML applications under timing constraints and the resource constraints of IoT devices. In this paper, we use image classification as our target application and propose progressive neural compression (PNC) as an efficient solution to this problem. Although neural compression has been used to compress images for different ML applications, existing solutions often produce fixed-size outputs that are unsuitable for timing-constrained offloading over variable bandwidth. To address this limitation, we train a multi-objective rateless autoencoder that optimizes for multiple compression rates via stochastic taildrop to create a compression solution that produces features ordered according to their importance to inference performance. Features are then transmitted in that order based on available bandwidth, with classification ultimately performed using the (sub)set of features received by the deadline. We demonstrate the benefits of PNC over state-of-the-art neural compression approaches and traditional compression methods on a testbed comprising an IoT device and an edge server connected over a wireless network with varying bandwidth.
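As a rough illustration of the training idea described above, the following PyTorch sketch trains an encoder and a classifier with stochastic taildrop: the encoder emits a fixed-length code, a random-length tail of the code is zeroed in each training step so that every prefix remains useful on its own, and the classifier learns from the masked code. The network sizes, code length, and training loop are illustrative assumptions, not the architecture used in the paper.

    # Illustrative sketch of a rateless autoencoder trained with stochastic
    # taildrop, in the spirit of PNC (layer sizes and dimensions are assumptions).
    import torch
    import torch.nn as nn

    class RatelessEncoder(nn.Module):
        def __init__(self, in_dim=3 * 32 * 32, code_dim=64):
            super().__init__()
            self.net = nn.Sequential(nn.Flatten(), nn.Linear(in_dim, 256),
                                     nn.ReLU(), nn.Linear(256, code_dim))

        def forward(self, x):
            # Earlier feature indices end up more important, because they are
            # retained more often under taildrop.
            return self.net(x)

    class Classifier(nn.Module):
        def __init__(self, code_dim=64, num_classes=10):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(code_dim, 128), nn.ReLU(),
                                     nn.Linear(128, num_classes))

        def forward(self, z):
            return self.net(z)

    def stochastic_taildrop(z):
        # Zero a random-length tail so that every prefix of the code is
        # trained to support classification on its own.
        batch, dim = z.shape
        keep = torch.randint(1, dim + 1, (batch, 1), device=z.device)
        mask = (torch.arange(dim, device=z.device).unsqueeze(0) < keep).float()
        return z * mask

    encoder, classifier = RatelessEncoder(), Classifier()
    optimizer = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()))

    def train_step(images, labels):
        logits = classifier(stochastic_taildrop(encoder(images)))
        loss = nn.functional.cross_entropy(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

At offloading time, the device would then transmit the leading features in order over the available bandwidth and stop at the deadline, and the server classifies from whatever prefix has arrived.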
Award ID(s):
2006530
PAR ID:
10552645
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-2857-8
Page Range / eLocation ID:
118 to 130
Format(s):
Medium: X
Location:
Taipei, Taiwan
Sponsoring Org:
National Science Foundation
More Like this
  1. Convolutional Neural Networks (CNNs) have given rise to numerous visual analytics applications at the edge of the Internet. Images are typically captured by cameras and then live-streamed to edge servers for analytics due to the prohibitive cost of running CNNs on computation-constrained end devices. A critical component for ensuring low-latency and accurate visual analytics offloading over low-bandwidth networks is image compression, which minimizes the amount of visual data to offload and maximizes the decoding quality of salient pixels for analytics. Despite their wide adoption, JPEG standards and traditional image compression techniques do not address the accuracy of analytics tasks, leading to ineffective compression for visual analytics offloading. Although recent machine-centric image compression techniques leverage sophisticated neural network models or hardware architectures to support the accuracy-bandwidth trade-off, they introduce excessive latency in the visual analytics offloading pipeline. This paper presents CICO, a Context-aware Image Compression Optimization framework to achieve low-bandwidth and low-latency visual analytics offloading. CICO contextualizes image compression for offloading by employing easily computable low-level image features to understand the importance of different image regions for a visual analytics task. Accordingly, CICO can optimize the trade-off between compression size and analytics accuracy. Extensive real-world experiments demonstrate that CICO reduces the bandwidth consumption of existing compression methods by up to 40% under comparable analytics accuracy. Regarding low-latency support, CICO achieves up to a 2x speedup over state-of-the-art compression techniques.
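    As a hedged illustration of the general idea (not CICO's actual optimization), the sketch below ranks image blocks by a cheap low-level feature, gradient energy, and maps the resulting importance to a per-block quality setting; the block size and quality range are assumptions introduced here for clarity.

        # Hedged illustration: rank image blocks with a cheap low-level feature
        # (gradient energy) and map importance to a per-block quality setting.
        # Block size and the quality range are assumptions, not CICO's design.
        import numpy as np

        def block_importance(gray, block=16):
            # Per-block importance from mean gradient magnitude, normalized to [0, 1].
            gy, gx = np.gradient(gray.astype(np.float32))
            energy = np.abs(gx) + np.abs(gy)
            h, w = gray.shape
            hb, wb = h // block, w // block
            imp = energy[:hb * block, :wb * block].reshape(hb, block, wb, block).mean(axis=(1, 3))
            return imp / (imp.max() + 1e-8)

        def per_block_quality(imp, q_low=20, q_high=90):
            # More important blocks get a higher (less lossy) quality setting.
            return (q_low + (q_high - q_low) * imp).astype(int)

    A region-aware encoder could then spend its bit budget according to these per-block qualities instead of compressing the whole frame uniformly.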
  2. The past decade has witnessed the rising dominance of deep learning and artificial intelligence in a wide range of applications. In particular, the ocean of wireless smartphones and IoT devices continues to fuel the tremendous growth of edge/cloud-based machine learning (ML) systems, including image/speech recognition and classification. To overcome the infrastructural barrier of limited network bandwidth in cloud ML, existing solutions have mainly relied on traditional compression codecs such as JPEG that were historically engineered for human end users instead of ML algorithms. Traditional codecs do not necessarily preserve features important to ML algorithms under limited bandwidth, leading to potentially inferior performance. This work investigates application-driven optimization of programmable commercial codec settings for networked learning tasks such as image classification. Based on the foundation of variational autoencoders (VAEs), we develop an end-to-end networked learning framework by jointly optimizing the codec and classifier without reconstructing images for a given data rate (bandwidth). Compared with the standard JPEG codec, the proposed VAE joint compression and classification framework achieves classification accuracy improvements of over 10% and 4% on the CIFAR-10 and ImageNet-1k data sets, respectively, at a data rate of 0.8 bpp. Our proposed VAE-based models show 65%–99% reductions in encoder size, 1.5×–13.1× improvements in inference speed, and 25%–99% savings in power compared to baseline models. We further show that a simple decoder can reconstruct images with sufficient quality without compromising classification accuracy.
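    A minimal sketch of such a joint design, under assumed layer sizes and with a simple KL penalty standing in for the rate (bandwidth) constraint, might look as follows: a VAE-style encoder feeds a classifier on the latent code, and images are never reconstructed during training.

        # Minimal sketch of joint compression and classification without image
        # reconstruction: a VAE-style encoder feeds a classifier, and the KL term
        # stands in for the rate (bandwidth) constraint. Sizes and the beta weight
        # are assumptions, not the paper's configuration.
        import torch
        import torch.nn as nn

        class VAEClassifier(nn.Module):
            def __init__(self, in_dim=3 * 32 * 32, latent=32, num_classes=10):
                super().__init__()
                self.enc = nn.Sequential(nn.Flatten(), nn.Linear(in_dim, 256), nn.ReLU())
                self.mu, self.logvar = nn.Linear(256, latent), nn.Linear(256, latent)
                self.cls = nn.Sequential(nn.Linear(latent, 128), nn.ReLU(),
                                         nn.Linear(128, num_classes))

            def forward(self, x):
                h = self.enc(x)
                mu, logvar = self.mu(h), self.logvar(h)
                z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
                return self.cls(z), mu, logvar

        def loss_fn(logits, labels, mu, logvar, beta=0.1):
            # Cross-entropy drives accuracy; the KL term penalizes the latent "rate".
            ce = nn.functional.cross_entropy(logits, labels)
            kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
            return ce + beta * kl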
  3. Wireless charging coupled with computation offloading in edge networks offers a promising solution for realizing power-hungry and computation-intensive applications on user devices. We consider a multi-access edge computing (MEC) system with collocated MEC servers and base stations/access points (BS/APs) supporting multiple users requesting data computation and wireless charging. We propose an integrated solution for wireless charging with computation offloading that satisfies the largest feasible proportion of the requested wireless charging while keeping total energy consumption at a minimum, subject to MEC-AP transmit power and latency constraints. We propose a novel nested algorithm that jointly performs data partitioning, time allocation, and transmit power control, and designs the optimal energy beamforming for wireless charging. Our resource allocation scheme offers a minimal-energy-consumption solution compared to other schemes while also delivering a larger amount of wirelessly transferred charge to the users. Even with data offloading, our proposed solution shows significant charging performance, comparable to the case of charging alone, demonstrating the effectiveness of performing partial offloading jointly with wireless charging.
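    To make the data-partitioning subproblem concrete, the toy linear program below chooses the offloaded fraction for each user to minimize total energy under per-user latency and a shared MEC compute budget; all constants are invented for illustration, and the paper's nested algorithm additionally optimizes time allocation, transmit power, and energy beamforming.

        # Toy linear program: choose the offloaded fraction alpha_i per user to
        # minimize total energy under per-user latency and a shared MEC compute
        # budget. All constants are invented for illustration.
        import numpy as np
        from scipy.optimize import linprog

        n = 3                                  # users
        D = np.array([2e6, 1e6, 3e6])          # task size, bits
        r = np.array([5e6, 8e6, 4e6])          # uplink rate, bits/s
        e_tx = np.array([0.8, 0.5, 1.1])       # energy to offload the whole task, J
        e_loc = np.array([2.0, 1.2, 2.5])      # energy to compute the whole task locally, J
        t_loc = np.array([0.8, 0.5, 1.0])      # local compute time for the whole task, s
        t_mec = np.array([0.1, 0.08, 0.15])    # MEC compute time for the whole task, s
        T, T_mec = 0.6, 0.25                   # per-user deadline and shared MEC budget, s

        # Total energy = sum_i alpha_i*e_tx_i + (1 - alpha_i)*e_loc_i; the constant
        # sum(e_loc) is dropped from the objective.
        c = e_tx - e_loc

        A_ub, b_ub = [], []
        for i in range(n):
            row = np.zeros(n); row[i] = D[i] / r[i] + t_mec[i]   # offloaded-path latency
            A_ub.append(row); b_ub.append(T)
            row = np.zeros(n); row[i] = -t_loc[i]                # local-path latency
            A_ub.append(row); b_ub.append(T - t_loc[i])
        A_ub.append(t_mec); b_ub.append(T_mec)                   # shared MEC budget

        res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, bounds=[(0, 1)] * n, method="highs")
        print("offloaded fractions:", np.round(res.x, 3))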
  4. With the explosion in Big Data, it is often forgotten that much of the data nowadays is generated at the edge. Specifically, a major source of data is users' endpoint devices, such as phones and smart watches, that are connected to the internet, also known as the Internet of Things (IoT). This "edge of data" faces several new challenges related to hardware constraints, privacy-aware learning, and distributed learning (both training and inference). So what systems and machine learning algorithms can we use to generate or exploit data at the edge? Can network science help us solve machine learning (ML) problems? Can IoT devices help people who live with some form of disability, and many others, benefit from health monitoring? In this tutorial, we introduce the network science and ML techniques relevant to edge computing, discuss systems for ML (e.g., model compression, quantization, HW/SW co-design) and ML for systems design (e.g., run-time resource optimization, power management for training and inference on edge devices), and illustrate their impact in addressing concrete IoT applications.
  5. Wireless charging coupled with computation offloading in edge networks offers a promising solution for realizing power-hungry and computation-intensive applications on user devices. We consider a multi-access edge computing (MEC) system with a collocated MEC server and base station/access point (AP), each equipped with a massive MIMO antenna array, supporting multiple users requesting data computation and wireless charging. The goal is to minimize the energy consumption for computation offloading and maximize the received energy at the users from wireless charging. The proposed solution is a novel two-stage algorithm employing a nested descent algorithm, primal-dual subgradient, and linear programming techniques to perform data partitioning and time allocation for computation offloading and to design the optimal energy beamforming for wireless charging, all within MEC-AP transmit power and latency constraints. Results show that optimal energy beamforming significantly outperforms other schemes such as isotropic or directed charging without beam power allocation. Compared to binary offloading, the data partitioning in partial offloading leads to lower energy consumption and more charging time, and hence better wireless charging performance. The energy charged over an extended period of multiple time slots, both with and without computation offloading, can be substantial. Wireless charging from the MEC-AP thus offers a viable untethered approach for supplying energy to user devices.
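    One ingredient of such a design can be sketched in a few lines: with a sum-energy objective, the energy beam that maximizes the total RF power delivered to the users is the principal eigenvector of the aggregate channel covariance. The channels, power budget, and baseline comparison below are illustrative assumptions; the paper's beamformer additionally accounts for per-user charging proportions, power limits, and latency.

        # Hedged numpy sketch: with a sum-energy objective, the energy beam that
        # maximizes total delivered RF power is the principal eigenvector of the
        # aggregate channel covariance. Channels, power budget, and the baseline
        # comparison are illustrative assumptions.
        import numpy as np

        rng = np.random.default_rng(0)
        m, k = 64, 4                                   # AP antennas, users
        H = (rng.standard_normal((m, k)) + 1j * rng.standard_normal((m, k))) / np.sqrt(2)

        R = H @ H.conj().T                             # aggregate channel covariance
        _, eigvecs = np.linalg.eigh(R)
        w = eigvecs[:, -1]                             # unit-norm energy beam

        p_tx = 1.0                                     # AP transmit power budget
        beamformed = p_tx * np.abs(H.conj().T @ w) ** 2        # per-user received power
        isotropic = p_tx * np.linalg.norm(H, axis=0) ** 2 / m  # rough no-beamforming baseline
        print("beamformed:", np.round(beamformed, 3))
        print("isotropic :", np.round(isotropic, 3))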