skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Offload Shaping for Wearable Cognitive Assistance
Edge computing has much lower elasticity than cloud computing because cloudlets have much smaller physical and electrical footprints than a data center. This hurts the scalability of applications that involve low-latency edge offload. We show how this problem can be addressed by leveraging the growing sophistication and compute capability of recent wearable devices. We investigate four Wearable Cognitive Assistance applications on three wearable devices, and show that the technique of offload shaping can significantly reduce network utilization and cloudlet load without compromising accuracy or performance. Our investigation considers the offload shaping strategies of mapping processes to different computing tiers, gating, and decluttering. We find that all three strategies offer a significant bandwidth savings compared to transmitting full camera images to a cloudlet. Two out of the three devices we test are capable of running all offload shaping strategies within a reasonable latency bound.  more » « less
Award ID(s):
2106862
PAR ID:
10589007
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Electronics
Volume:
13
Issue:
20
ISSN:
2079-9292
Page Range / eLocation ID:
4083
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Edge computing is an attractive architecture to efficiently provide compute resources to many applications that demand specific QoS requirements. The edge compute resources are in close geographical proximity to where the applications’ data originate from and/or are being supplied to, thus avoiding unnecessary back and forth data transmission with a data center far away. This paper describes a federated edge computing system in which compute resources at multiple edge sites are dynamically aggregated together to form distributed super-cloudlets and best respond to varying application-driven loads. In its simplest form a super-cloudlet consists of compute resources available at two edge computing sites or cloudlets that are (temporarily) interconnected by dedicated optical circuits deployed to enable low-latency and high-rate data exchanges. A super-cloudlet architecture is experimentally demonstrated over the largest public OpenROADM optical network testbed up to date consisting of commercial equipment from six suppliers. The software defined networking (SDN) PROnet Orchestrator is upgraded to both concurrently manage the resources offered by the optical network equipment, compute nodes, and associated Ethernet switches and achieve three key functionalities of the proposed super-cloudlet architecture, i.e., service placement, auto-scaling, and offloading. 
    more » « less
  2. We present a benchmark-driven experimental study of autonomous drone agility relative to edge offload pipeline attributes. This pipeline includes a monocular gimbal-actuated on-drone camera, hardware RTSP video encoding, 4G LTE wireless network transmission, and computer vision processing on a ground-based GPU-equipped cloudlet. Our parameterized and reproducible agility benchmarks stress the OODA (“Observe, Orient, Decide, Act”) loop of the drone on obstacle avoidance and object tracking tasks. We characterize the latency and throughput of components of this OODA loop through software profiling, and identify opportunities for optimization. 
    more » « less
  3. null (Ed.)
    In today's era of Internet of Things (IoT), where massive amounts of data are produced by IoT and other devices, edge computing has emerged as a prominent paradigm for low-latency data processing. However, applications may have diverse latency requirements: certain latency-sensitive processing operations may need to be performed at the edge, while delay-tolerant operations can be performed on the cloud, without occupying the potentially limited edge computing resources. To achieve that, we envision an environment where computing resources are distributed across edge and cloud offerings. In this paper, we present the design of CLEDGE (CLoud + EDGE), an information-centric hybrid cloud-edge framework, aiming to maximize the on-time completion of computational tasks offloaded by applications with diverse latency requirements. The design of CLEDGE is motivated by the networking challenges that mixed reality researchers face. Our evaluation demonstrates that CLEDGE can complete on-time more than 90% of offloaded tasks with modest overheads. 
    more » « less
  4. null (Ed.)
    Edge cloud data centers (Edge) are deployed to provide responsive services to the end-users. Edge can host more powerful CPUs and DNN accelerators such as GPUs and may be used for offloading tasks from end-user devices that require more significant compute capabilities. But Edge resources may also be limited and must be shared across multiple applications that process requests concurrently from several clients. However, multiplexing GPUs across applications is challenging. With edge cloud servers needing to process a lot of streaming and the advent of multi-GPU systems, getting that data from the network to the GPU can be a bottleneck, limiting the amount of work the GPU cluster can do. The lack of prompt notification of job completion from the GPU can also result in poor GPU utilization. We build on our recent work on controlled spatial sharing of a single GPU to expand to support multi-GPU systems and propose a framework that addresses these challenges. Unlike the state-of-the-art uncontrolled spatial sharing currently available with systems such as CUDA-MPS, our controlled spatial sharing approach uses each of the GPU in the cluster efficiently by removing interference between applications, resulting in much better, predictable, inference latency We also use each of the cluster GPU's DMA engines to offload data transfers to the GPU complex, thereby preventing the CPU from being the bottleneck. Finally, our framework uses the CUDA event library to give timely, low overhead GPU notifications. Our evaluations show we can achieve low DNN inference latency and improve DNN inference throughput by at least a factor of 2. 
    more » « less
  5. Edge computing allows end-user devices to offload heavy computation to nearby edge servers for reduced latency, maximized profit, and/or minimized energy consumption. Data-dependent tasks that analyze locally-acquired sensing data are one of the most common candidates for task offloading in edge computing. As a result, the total latency and network load are affected by the total amount of data transferred from end-user devices to the selected edge servers. Most existing solutions for task allocation in edge computing do not take into consideration that some user tasks may actually operate on the same data items. Making the task allocation algorithm aware of the existing data sharing characteristics of tasks can help reduce network load at a negligible profit loss by allocating more tasks sharing data on the same server. In this paper, we formulate the data sharing-aware task allocation problem that make decisions on task allocation for maximized profit and minimized network load by taking into account the data-sharing characteristics of tasks. In addition, because the problem is NP-hard, we design the DSTA algorithm, which finds a solution to the problem in polynomial time. We analyze the performance of the proposed algorithm against a state-of-the-art baseline that only maximizes profit. Our extensive analysis shows that DSTA leads to about 8 times lower data load on the network while being within 1.03 times of the total profit on average compared to the state-of-the-art. 
    more » « less