Title: Service Placement for Real-Time Applications: Rate-Adaptation and Load-Balancing at the Network Edge
Mobile Edge Computing may become a prevalent platform to support applications where mobile devices have limited compute, storage, or energy resources and/or data privacy concerns. In this paper, we study the efficient provisioning and management of compute resources in the Edge-to-Cloud continuum for different types of real-time applications with timeliness requirements depending on application-level update rates and communication/compute delays. We begin by introducing a highly stylized network model allowing us to study the salient features of this problem, including its sensitivity to compute vs. communication costs, application requirements, and traffic load variability. We then propose an online decentralized service placement algorithm, based on estimating network delays and adapting application update rates, which achieves high service availability. Our results exhibit how placement can be optimized and how a load-balancing strategy c…
Award ID(s):
1809327
NSF-PAR ID:
10168363
Journal Name:
International Conference on Edge Computing
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Next-generation mobile networks (5G and beyond) are expected to provide higher data rates and ultra-low latency in support of demanding applications, such as virtual and augmented reality, robots, and drones. To meet these stringent requirements, edge computing constitutes a central piece of the solution architecture wherein functional components of an application can be deployed over the edge network so as to reduce bandwidth demand over the core network while providing ultra-low latency communication to users. In this paper, we investigate the joint optimal placement of virtual service chains consisting of virtual application functions (components) and the steering of traffic through them, over a 5G multi-technology edge network model consisting of both Ethernet and mmWave links. This problem is NP-hard. We provide a comprehensive "microscopic" binary integer program to model the system, along with a heuristic that is one order of magnitude faster than solving the corresponding binary integer program. Extensive evaluations demonstrate the benefits of managing virtual service chains (by distributing them over the edge network) compared to a baseline "middlebox" approach in terms of overall admissible virtual capacity. We observe significant gains when deploying mmWave links that complement the Ethernet physical infrastructure. Moreover, most of the gains are attributed to only 30% of these mmWave links.
  2. In this paper, we analyze the performance of Multiplayer Cloud Gaming (MCG) systems. To that end, we introduce a model and a new MCG Quality of Service (MCG-QoS) metric that captures the freshness of the players' updates and fairness in their gaming experience. We introduce an efficient measurement-based Joint Multiplayer Rate Adaptation (JMRA) algorithm that optimizes the MCG-QoS by overcoming large (possibly varying) network transport delays by increasing the associated players' update rates. The resulting MCG-QoS is shown to be Schur-concave in the network delays, leading to natural characterizations and performance comparisons associated with the players' spatial geometry and network congestion. In particular, joint rate adaptation enables service providers to combat variability in network delays and players' geographic spread to achieve high service coverage. This, in turn, allows us to explore the spatial density and capacity of compute resources that need to be provisioned. Finally, we leverage tools from majorization theory to show how service placement decisions can be made to improve the robustness of the MCG-QoS to stochastic network delays.
  3. In this paper, we consider a large-scale heterogeneous mobile edge computing system, where each device's mean computing task arrival rate, mean service rate, mean energy consumption, and mean offloading latency are drawn from different bounded continuous probability distributions to reflect the diverse compute-intensive applications, mobile devices with different computing capabilities and battery efficiencies, and different types of wireless access networks (e.g., 4G/5G cellular networks, WiFi). We consider a class of distributed threshold-based randomized offloading policies and develop a threshold update algorithm based on its computational load, average offloading latency, average energy consumption, and edge server processing time, depending on the server utilization. We show that there always exists a unique Mean-Field Nash Equilibrium (MFNE) in the large-system limit when the task processing times of mobile devices follow an exponential distribution. This is achieved by carefully partitioning the space of mean arrival rates to account for the discrete structure of each device's optimal threshold. Moreover, we show that our proposed threshold update algorithm converges to the MFNE. Finally, we perform simulations to corroborate our theoretical results and demonstrate that our proposed algorithm still performs well in more general setups based on collected real-world data and outperforms the well-known probabilistic offloading policy.
  4. Edge computing is an attractive architecture to efficiently provide compute resources to many applications with specific QoS requirements. The edge compute resources are in close geographical proximity to where the applications' data originate and/or are delivered, thus avoiding unnecessary back-and-forth data transmission with a faraway data center. This paper describes a federated edge computing system in which compute resources at multiple edge sites are dynamically aggregated to form distributed super-cloudlets and best respond to varying application-driven loads. In its simplest form, a super-cloudlet consists of compute resources available at two edge computing sites, or cloudlets, that are (temporarily) interconnected by dedicated optical circuits deployed to enable low-latency and high-rate data exchanges. A super-cloudlet architecture is experimentally demonstrated over the largest public OpenROADM optical network testbed to date, consisting of commercial equipment from six suppliers. The software defined networking (SDN) PROnet Orchestrator is upgraded both to concurrently manage the resources offered by the optical network equipment, compute nodes, and associated Ethernet switches and to achieve three key functionalities of the proposed super-cloudlet architecture, i.e., service placement, auto-scaling, and offloading.
  5. The proliferation of innovative mobile services such as augmented reality, networked gaming, and autonomous driving has spurred a growing need for low-latency access to computing resources that cannot be met solely by existing centralized cloud systems. Mobile Edge Computing (MEC) is expected to be an effective solution to meet the demand for low-latency services by enabling the execution of computing tasks at the network periphery, in proximity to end-users. While a number of recent studies have addressed the problem of determining the execution of service tasks and the routing of user requests to corresponding edge servers, the focus has primarily been on the efficient utilization of computing resources, neglecting the fact that non-trivial amounts of data need to be stored to enable service execution, and that many emerging services exhibit asymmetric bandwidth requirements. To fill this gap, we study the joint optimization of service placement and request routing in MEC-enabled multi-cell networks with multidimensional (storage-computation-communication) constraints. We show that this problem generalizes several problems in the literature and propose an algorithm that achieves close-to-optimal performance using randomized rounding. Evaluation results demonstrate that our approach can effectively utilize the available resources to maximize the number of requests served by low-latency edge cloud servers.