Title: Joint Service Placement and Request Routing in Multi-cell Mobile Edge Computing Networks
The proliferation of innovative mobile services such as augmented reality, networked gaming, and autonomous driving has spurred a growing need for low-latency access to computing resources that cannot be met solely by existing centralized cloud systems. Mobile Edge Computing (MEC) is expected to be an effective solution to meet the demand for low-latency services by enabling the execution of computing tasks at the network periphery, in proximity to end-users. While a number of recent studies have addressed the problem of determining the execution of service tasks and the routing of user requests to corresponding edge servers, the focus has primarily been on the efficient utilization of computing resources, neglecting the fact that non-trivial amounts of data need to be stored to enable service execution, and that many emerging services exhibit asymmetric bandwidth requirements. To fill this gap, we study the joint optimization of service placement and request routing in MEC-enabled multi-cell networks with multidimensional (storage-computation-communication) constraints. We show that this problem generalizes several problems in the literature and propose an algorithm that achieves close-to-optimal performance using randomized rounding. Evaluation results demonstrate that our approach can effectively utilize the available resources to maximize the number of requests served by low-latency edge cloud servers.
Award ID(s):
1815676
NSF-PAR ID:
10121142
Journal Name:
IEEE International Conference on Computer Communications
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Next-generation distributed computing networks (e.g., edge and fog computing) enable the efficient delivery of delay-sensitive, compute-intensive applications by facilitating access to computation resources in close proximity to end users. Many of these applications (e.g., augmented/virtual reality) are also data-intensive: in addition to user-specific (live) data streams, they require access to shared (static) digital objects (e.g., an image database) to complete the required processing tasks. When required objects are not available at the servers hosting the associated service functions, they must be fetched from other edge locations, incurring additional communication cost and latency. In such settings, overall service delivery performance shall benefit from jointly optimized decisions around (i) routing paths and processing locations for live data streams, together with (ii) cache selection and distribution paths for associated digital objects. In this paper, we address the problem of dynamic control of data-intensive services over edge cloud networks. We characterize the network stability region and design the first throughput-optimal control policy that coordinates processing and routing decisions for both live and static data streams. Numerical results demonstrate the superior performance (e.g., throughput, delay, and resource consumption) obtained via the novel multi-pipeline flow control mechanism of the proposed policy, compared with state-of-the-art algorithms that lack integrated stream processing and data distribution control.
  2. Mobile edge computing (MEC) is an emerging paradigm that integrates computing resources in wireless access networks to process computational tasks in close proximity to mobile users with low latency. In this paper, we propose an online double deep Q-network (DDQN) based learning scheme for task assignment in dynamic MEC networks, which enables multiple distributed edge nodes and a cloud data center to jointly process user tasks to achieve optimal long-term quality of service (QoS). The proposed scheme captures a wide range of dynamic network parameters including non-stationary node computing capabilities, network delay statistics, and task arrivals. It learns the optimal task assignment policy with no assumption on the knowledge of the underlying dynamics. In addition, the proposed algorithm accounts for both performance and complexity, and addresses the state and action space explosion problem in conventional Q-learning. The evaluation results show that the proposed DDQN-based task assignment scheme significantly improves the QoS performance, compared to the existing schemes that do not consider the effects of network dynamics on the expected long-term rewards, while scaling reasonably well as the network size increases.
  3. With the explosion of intelligent and latency-sensitive applications such as AR/VR, remote health and autonomous driving, mobile edge computing (MEC) has emerged as a promising solution to mitigate the high end-to-end latency of mobile cloud computing (MCC). However, the edge servers have significantly less computing capability compared to the resourceful central cloud. Therefore, a collaborative cloud-edge-local offloading scheme is necessary to accommodate both computationally intensive and latency-sensitive mobile applications. The coexistence of central cloud, edge servers and the mobile device (MD), forming a multi-tiered heterogeneous architecture, makes the optimal application deployment very challenging, especially for multi-component applications with component dependencies. This paper addresses the problem of energy- and latency-efficient application offloading in a collaborative cloud-edge-local environment. We formulate a multi-objective mixed integer linear program (MILP) with the goal of minimizing the system-wide energy consumption and application end-to-end latency. An approximation algorithm based on LP relaxation and rounding is proposed to address the time complexity. We demonstrate that our approach outperforms existing strategies in terms of application request acceptance ratio, latency and system energy consumption.
  4. The increased use of micro-services to build web applications has spurred the rapid growth of Function-as-a-Service (FaaS) or serverless computing platforms. While FaaS simplifies provisioning and scaling for application developers, it introduces new challenges in resource management that need to be handled by the cloud provider. Our analysis of popular serverless workloads indicates that schedulers need to handle functions that are very short-lived, have unpredictable arrival patterns, and require expensive setup of sandboxes. The challenge of running a large number of such functions in a multi-tenant cluster makes existing scheduling frameworks unsuitable. We present Archipelago, a platform that enables low latency request execution in a multi-tenant serverless setting. Archipelago views each application as a DAG of functions, and every DAG is associated with a latency deadline. Archipelago achieves its per-DAG request latency goals by: (1) partitioning a given cluster into a number of smaller worker pools, and associating each pool with a semi-global scheduler (SGS), (2) using a latency-aware scheduler within each SGS along with proactive sandbox allocation to reduce overheads, and (3) using a load balancing layer to route requests for different DAGs to the appropriate SGS, and automatically scale the number of SGSs per DAG. Our testbed results show that Archipelago meets the latency deadline for more than 99% of realistic application request workloads, and reduces tail latencies by up to 36X compared to state-of-the-art serverless platforms.
  5. Task offloading in edge computing infrastructure remains a challenge for dynamic and complex environments, such as Industrial Internet-of-Things. The hardware resource constraints of edge servers must be explicitly considered to ensure that system resources are not overloaded. Many works have studied task offloading while focusing primarily on ensuring system resilience. However, in the face of deep learning-based services, model performance with respect to loss/accuracy must also be considered. Deep learning services with different implementations may provide varying amounts of loss/accuracy while also being more complex to run inference on. That said, communication latency can be reduced to improve overall Quality-of-Service by employing compression techniques. However, such techniques can also have the side-effect of reducing the loss/accuracy provided by deep learning-based services. As such, this work studies a joint optimization problem for task offloading decisions in 3-tier edge computing platforms where decisions regarding task offloading are made in tandem with compression decisions. The objective is to optimally offload requests with compression such that the latency-accuracy trade-off is not greatly jeopardized. We cast this problem as a mixed integer nonlinear program. Due to its nonlinear nature, we then decompose it into separate subproblems for offloading and compression. An efficient algorithm is proposed to solve the problem. Empirically, we show that our algorithm attains roughly a 0.958-approximation of the optimal solution provided by a block coordinate descent method for solving the two sub-problems back-to-back.