skip to main content

Title: Data Sharing-Aware Task Allocation in Edge Computing Systems
Edge computing allows end-user devices to offload heavy computation to nearby edge servers for reduced latency, maximized profit, and/or minimized energy consumption. Data-dependent tasks that analyze locally-acquired sensing data are one of the most common candidates for task offloading in edge computing. As a result, the total latency and network load are affected by the total amount of data transferred from end-user devices to the selected edge servers. Most existing solutions for task allocation in edge computing do not take into consideration that some user tasks may actually operate on the same data items. Making the task allocation algorithm aware of the existing data sharing characteristics of tasks can help reduce network load at a negligible profit loss by allocating more tasks sharing data on the same server. In this paper, we formulate the data sharing-aware task allocation problem that make decisions on task allocation for maximized profit and minimized network load by taking into account the data-sharing characteristics of tasks. In addition, because the problem is NP-hard, we design the DSTA algorithm, which finds a solution to the problem in polynomial time. We analyze the performance of the proposed algorithm against a state-of-the-art baseline that only maximizes profit. Our more » extensive analysis shows that DSTA leads to about 8 times lower data load on the network while being within 1.03 times of the total profit on average compared to the state-of-the-art. « less
Authors:
; ; ;
Award ID(s):
2118202
Publication Date:
NSF-PAR ID:
10329286
Journal Name:
2021 IEEE International Conference on Edge Computing (EDGE)
Page Range or eLocation-ID:
60 to 67
Sponsoring Org:
National Science Foundation
More Like this
  1. Edge cloud data centers (Edge) are deployed to provide responsive services to the end-users. Edge can host more powerful CPUs and DNN accelerators such as GPUs and may be used for offloading tasks from end-user devices that require more significant compute capabilities. But Edge resources may also be limited and must be shared across multiple applications that process requests concurrently from several clients. However, multiplexing GPUs across applications is challenging. With edge cloud servers needing to process a lot of streaming and the advent of multi-GPU systems, getting that data from the network to the GPU can be a bottleneck, limiting the amount of work the GPU cluster can do. The lack of prompt notification of job completion from the GPU can also result in poor GPU utilization. We build on our recent work on controlled spatial sharing of a single GPU to expand to support multi-GPU systems and propose a framework that addresses these challenges. Unlike the state-of-the-art uncontrolled spatial sharing currently available with systems such as CUDA-MPS, our controlled spatial sharing approach uses each of the GPU in the cluster efficiently by removing interference between applications, resulting in much better, predictable, inference latency We also use each ofmore »the cluster GPU's DMA engines to offload data transfers to the GPU complex, thereby preventing the CPU from being the bottleneck. Finally, our framework uses the CUDA event library to give timely, low overhead GPU notifications. Our evaluations show we can achieve low DNN inference latency and improve DNN inference throughput by at least a factor of 2.« less
  2. The proliferation of innovative mobile services such as augmented reality, networked gaming, and autonomous driving has spurred a growing need for low-latency access to computing resources that cannot be met solely by existing centralized cloud systems. Mobile Edge Computing (MEC) is expected to be an effective solution to meet the demand for low-latency services by enabling the execution of computing tasks at the network-periphery, in proximity to end-users. While a number of recent studies have addressed the problem of determining the execution of service tasks and the routing of user requests to corresponding edge servers, the focus has primarily been on the efficient utilization of computing resources, neglecting the fact that non-trivial amounts of data need to be stored to enable service execution, and that many emerging services exhibit asymmetric bandwidth requirements. To fill this gap, we study the joint optimization of service placement and request routing in MEC-enabled multi-cell networks with multidimensional (storage-computation-communication) constraints. We show that this problem generalizes several problems in literature and propose an algorithm that achieves close-to-optimal performance using randomized rounding. Evaluation results demonstrate that our approach can effectively utilize the available resources to maximize the number of requests served by low-latency edge cloud servers.
  3. Unmanned Aerial Vehicle (UAV)-assisted Multi-access Edge Computing (MEC) systems have emerged recently as a flexible and dynamic computing environment, providing task offloading service to the users. In order for such a paradigm to be viable, the operator of a UAV-mounted MEC server should enjoy some form of profit by offering its computing capabilities to the end users. To deal with this issue in this paper, we apply a usage-based pricing policy for allowing the exploitation of the servers’ computing resources. The proposed pricing mechanism implicitly introduces a more social behavior to the users with respect to competing for the UAV-mounted MEC servers’ computation resources. In order to properly model the users’ risk-aware behavior within the overall data offloading decision-making process the principles of Prospect Theory are adopted, while the exploitation of the available computation resources is considered based on the theory of the Tragedy of the Commons. Initially, the user’s prospect-theoretic utility function is formulated by quantifying the user’s risk seeking and loss aversion behavior, while taking into account the pricing mechanism. Accordingly, the users’ pricing and risk-aware data offloading problem is formulated as a distributed maximization problem of each user’s expected prospect-theoretic utility function and addressed as a non-cooperativemore »game among the users. The existence of a Pure Nash Equilibrium (PNE) for the formulated non-cooperative game is shown based on the theory of submodular games. An iterative and distributed algorithm is introduced which converges to the PNE, following the learning rule of the best response dynamics. The performance evaluation of the proposed approach is achieved via modeling and simulation, and detailed numerical results are presented highlighting its key operation features and benefits.« less
  4. Next-generation distributed computing networks (e.g., edge and fog computing) enable the efficient delivery of delay-sensitive, compute-intensive applications by facilitating access to computation resources in close proximity to end users. Many of these applications (e.g., augmented/virtual reality) are also data-intensive: in addition to user-specific (live) data streams, they require access to shared (static) digital objects (e.g., im-age database) to complete the required processing tasks. When required objects are not available at the servers hosting the associated service functions, they must be fetched from other edge locations, incurring additional communication cost and latency. In such settings, overall service delivery performance shall benefit from jointly optimized decisions around (i) routing paths and processing locations for live data streams, together with (ii) cache selection and distribution paths for associated digital objects. In this paper, we address the problem of dynamic control of data-intensive services over edge cloud networks. We characterize the network stability region and design the first throughput-optimal control policy that coordinates processing and routing decisions for both live and static data-streams. Numerical results demonstrate the superior performance (e.g., throughput, delay, and resource consumption) obtained via the novel multi-pipeline flow control mechanism of the proposed policy, compared with state-of-the-art algorithms that lack integratedmore »stream processing and data distribution control.« less
  5. Edge clouds can provide very responsive services for end-user devices that require more significant compute capabilities than they have. But edge cloud resources such as CPUs and accelerators such as GPUs are limited and must be shared across multiple concurrently running clients. However, multiplexing GPUs across applications is challenging. Further, edge servers are likely to require considerable amounts of streaming data to be processed. Getting that data from the network stream to the GPU can be a bottleneck, limiting the amount of work GPUs do. Finally, the lack of prompt notification of job completion from GPU also results in ineffective GPU utilization. We propose a framework that addresses these challenges in the following manner. We utilize spatial sharing of GPUs to multiplex the GPU more efficiently. While spatial sharing of GPU can increase GPU utilization, the uncontrolled spatial sharing currently available with state-of-the-art systems such as CUDA-MPS can cause interference between applications, resulting in unpredictable latency. Our framework utilizes controlled spatial sharing of GPU, which limits the interference across applications. Our framework uses the GPU DMA engine to offload data transfer to GPU, therefore preventing CPU from being bottleneck while transferring data from the network to GPU. Our framework usesmore »the CUDA event library to have timely, low overhead GPU notifications. Preliminary experiments show that we can achieve low DNN inference latency and improve DNN inference throughput by a factor of ∼1.4.« less