Title: Flow-level Dynamic Bandwidth Allocation in SDN-enabled Edge Cloud using Heuristic Reinforcement Learning
Edge Cloud (EC) is poised to support massive machine-type communication (mMTC) for 5G and IoT by providing compute and network resources at the edge. Yet the EC, being regional and smaller in scale, faces challenges in bandwidth and computational throughput, so resource management techniques are necessary to achieve efficient resource allocation. Software Defined Network (SDN) enabled EC architecture is emerging as a potential solution that supports dynamic bandwidth allocation and task scheduling for latency-sensitive and diverse mobile applications in the EC environment. This study proposes a novel Heuristic Reinforcement Learning (HRL) based flow-level dynamic bandwidth allocation framework and validates it through an end-to-end implementation using the OpenFlow meter feature. OpenFlow meters provide granular control and allow demand-based flow management to meet the diverse QoS requirements germane to IoT traffic. The proposed framework is evaluated by emulating an EC scenario based on the real NSF COSMOS testbed topology at The City College of New York. A heuristic reinforcement learning approach with a linear-annealing technique and a pruning principle are proposed and compared with the baseline approach. Our proposed strategy performs consistently in both Mininet-emulated and hardware OpenFlow switch environments. The performance evaluation considers key metrics associated with real-time applications: throughput, end-to-end delay, packet loss rate, and overall system cost for bandwidth allocation. Furthermore, our proposed linear-annealing method achieves a faster convergence rate and better reward in terms of system cost, and the proposed pruning principle remarkably reduces control traffic in the network.
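As a concrete illustration of the OpenFlow meter mechanism the abstract relies on, the following is a minimal sketch, assuming the Ryu controller and an OpenFlow 1.3 switch; the meter ID, rate limit, and match fields are illustrative placeholders rather than the paper's configuration:

```python
# A minimal sketch, assuming the Ryu controller and an OpenFlow 1.3 switch.
# meter_id, rate, and the match are illustrative, not the paper's values.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

class MeterSketch(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def on_switch_ready(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser

        # Install a meter that drops traffic above 10 Mbps (rate is in kbps).
        band = parser.OFPMeterBandDrop(rate=10000, burst_size=1000)
        dp.send_msg(parser.OFPMeterMod(dp, command=ofp.OFPMC_ADD,
                                       flags=ofp.OFPMF_KBPS,
                                       meter_id=1, bands=[band]))

        # Send all IPv4 traffic through meter 1, then forward normally.
        match = parser.OFPMatch(eth_type=0x0800)
        inst = [parser.OFPInstructionMeter(meter_id=1),
                parser.OFPInstructionActions(
                    ofp.OFPIT_APPLY_ACTIONS,
                    [parser.OFPActionOutput(ofp.OFPP_NORMAL)])]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=10,
                                      match=match, instructions=inst))
```

An allocation agent such as the proposed HRL framework could then retune the limit at run time by re-issuing OFPMeterMod with command=ofp.OFPMC_MODIFY and a new rate.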
Award ID(s):
1818884, 2029295
NSF-PAR ID:
10289812
Journal Name:
IEEE Conference on Future Internet of Things and Cloud
Sponsoring Org:
National Science Foundation
More Like this
  1. To deploy powerful deep neural networks (DNNs) on smart but resource-limited IoT devices, many prior works have proposed compressing DNNs to reduce network size and computational complexity with negligible accuracy degradation, using techniques such as weight quantization, network pruning, and convolution decomposition. However, conventional DNN compression methods generate a smaller but fixed network from a relatively large background model to achieve hardware acceleration on resource-limited devices. Such optimization lacks the ability to adjust the network structure in real time to adapt to dynamic hardware resource allocation and workloads. In this paper, we mainly review our two prior works [13], [15] that tackle this challenge, discussing how to construct a dynamic DNN by means of either uniform or non-uniform sub-net generation methods. To generate multiple non-uniform sub-nets, [15] needs to fully retrain the background model for each sub-net individually, referred to as the multi-path method. To reduce the training cost, in this work we further propose a single-path sub-net generation method that can sample multiple sub-nets in different epochs within one training round (see the sketch below). The constructed dynamic DNN, consisting of multiple sub-nets, provides the ability to trade off inference accuracy and latency at run time according to hardware resources and environment requirements. Finally, we study dynamic DNNs with different sub-net generation methods on both the CIFAR-10 and ImageNet datasets, and present run-time tuning of accuracy and latency on both GPU and CPU.
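As a concrete illustration of the single-path idea, here is a minimal sketch assuming a slimmable layer whose output width is sampled anew each epoch; the widths, shapes, and objective are placeholders, not the papers' training recipe:

```python
# A minimal sketch of single-path sub-net sampling, under assumed details
# (a slimmable linear layer, one random width per epoch).
import random
import torch
import torch.nn as nn

class SlimmableLinear(nn.Linear):
    """Linear layer that can execute on a left-aligned slice of its weights."""
    def forward(self, x, width=1.0):
        out = max(1, int(self.out_features * width))
        return nn.functional.linear(x, self.weight[:out], self.bias[:out])

layer = SlimmableLinear(64, 32)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
widths = [0.25, 0.5, 0.75, 1.0]               # candidate sub-net widths

for epoch in range(4):                        # one sub-net per epoch, all
    wd = random.choice(widths)                # within a single training round
    x = torch.randn(8, 64)
    loss = layer(x, width=wd).pow(2).mean()   # placeholder objective
    opt.zero_grad()
    loss.backward()
    opt.step()

# At inference, pick `width` to trade accuracy for latency on the fly.
```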
  2. In current infrastructure-as-a-service (IaaS) cloud services, customers are charged for the usage of computing/storage resources only, but not for the network resource. The difficulty lies in the fact that it is nontrivial to allocate network resources to individual customers effectively, especially for short-lived flows, in terms of both performance and cost, due to the highly dynamic environment created by the flows of all customers. To tackle this challenge, we propose an end-to-end Price-Aware Congestion Control Protocol (PACCP) for cloud services. PACCP is a network utility maximization (NUM) based optimal congestion control protocol. It supports three classes of service (CoS): best-effort (BE) service, differentiated service (DS), and minimum-rate-guaranteed (MRG) service. In PACCP, the desired CoS or rate allocation for a given flow is enabled by properly setting a pair of control parameters, a minimum guaranteed rate and a utility weight, which in turn determine the price paid by the user of the flow (a toy allocation sketch follows below). Two pricing models are proposed: a coarse-grained VM-Based Pricing model (VBP) and a fine-grained Flow-Based Pricing model (FBP). The optimality of PACCP is verified by both large-scale simulation and a small testbed implementation, and its price-performance consistency is evaluated using real datacenter workloads. The results demonstrate that PACCP provides minimum rate guarantees, high bandwidth utilization, and fair rate allocation, commensurate with the pricing models.
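To make the rate-allocation mechanics concrete, here is a worked sketch of the NUM idea on a single bottleneck link (not PACCP itself): each flow receives x_i = max(m_i, w_i / p) for a link price p found by bisection; the minimum rates, weights, and capacity are invented examples:

```python
# Weighted proportional fairness with minimum-rate floors on one link.
# Assumes sum(min_rates) <= capacity, otherwise the problem is infeasible.

def allocate(min_rates, weights, capacity, iters=60):
    lo, hi = 1e-9, 1e9                      # bracket for the link price p
    rates = []
    for _ in range(iters):
        p = (lo + hi) / 2.0
        rates = [max(m, w / p) for m, w in zip(min_rates, weights)]
        if sum(rates) > capacity:
            lo = p                          # price too low: demand exceeds C
        else:
            hi = p                          # price too high: capacity unused
    return rates

# BE-like flow (no guarantee), DS-like flow (4x utility weight),
# MRG-like flow (20 units guaranteed); link capacity 100.
print(allocate(min_rates=[0, 0, 20], weights=[1, 4, 1], capacity=100))
# -> roughly [16.0, 64.0, 20.0]
```

The floor plays the role of the minimum guaranteed rate and the weight steers the share of the residual capacity, mirroring how PACCP's two control parameters select the class of service.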
  3. The rapid growth of mobile data traffic is straining cellular networks. A natural approach to alleviating cellular network congestion is to use, in addition to the cellular interface, secondary interfaces such as WiFi, dynamic spectrum, and mmWave to aid cellular networks in handling mobile traffic. The fundamental question becomes: how should traffic be distributed over the different interfaces, taking into account application QoS requirements and the diverse nature of the radio interfaces? To this end, we propose the Discounted Rate Utility Maximization (DRUM) framework with interface costs as a means to quantify application preferences in terms of throughput, delay, and cost. The flow-rate allocation problem can be formulated as a convex optimization problem; however, solving it requires non-causal knowledge of the time-varying capacities of all radio interfaces. We therefore propose an online predictive algorithm that exploits the predictability of wireless connectivity over a small look-ahead window w. We show that, under some mild conditions, the proposed algorithm achieves a constant competitive ratio independent of the time horizon T, and that the competitive ratio approaches 1 as the prediction window increases. We also propose another predictive algorithm, based on the receding horizon control principle from control theory, that performs very well in practice (a minimal sketch appears below). Numerical simulations validate our formulation by showing that, under the DRUM framework, the more delay-tolerant the flow, the less it uses the cellular network, preferring to transmit in high-rate bursts over the secondary interfaces; conversely, delay-sensitive flows transmit consistently irrespective of interface availability. Simulations also show that the proposed online predictive algorithms achieve near-optimal performance compared to the offline prescient solution under all considered scenarios.
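As an illustration of the receding-horizon variant, the following minimal sketch (not the paper's algorithm) splits one flow across two interfaces using a discounted log utility, per-interface costs, and an invented capacity forecast; at each slot it solves the window problem and applies only the first decision:

```python
# Receding-horizon rate allocation in the spirit of DRUM; the utilities,
# costs, and "predicted" capacities below are placeholder assumptions.
import numpy as np
import cvxpy as cp

T, w = 20, 5                                   # horizon, look-ahead window
cost = np.array([1.0, 0.2])                    # per-unit cost: cellular, WiFi
cap = np.abs(np.random.randn(T + w, 2)) * 5.0  # placeholder capacity forecast
disc = 0.9 ** np.arange(w)                     # discount inside the window

applied = []
for t in range(T):
    x = cp.Variable((w, 2), nonneg=True)       # rates over the window
    util = cp.sum(cp.multiply(disc, cp.log(1 + cp.sum(x, axis=1))))
    objective = cp.Maximize(util - cp.sum(x @ cost))
    constraints = [x <= cap[t:t + w]]          # stay within predicted capacity
    cp.Problem(objective, constraints).solve()
    applied.append(x.value[0])                 # apply first slot only, re-plan
```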
  4. Systems for the Internet of Things (IoT) have generated new requirements in all aspects of their development and deployment, including expanded Quality of Service (QoS) needs, enhanced resiliency of computing and connectivity, and scalability to support massive numbers of end devices in a variety of applications. The research reported here concerns the development of a reliable and secure IoT/cyber-physical system (CPS), providing network support for smart and connected communities, to be realized by means of distributed, secure, resilient Edge Cloud (EC) computing. This distributed EC system will be a network of geographically distributed EC nodes, brokering between end devices and Backend Cloud (BC) servers. This paper focuses on three main aspects of the CPS: a) resource management in mobile cloud computing; b) information management in dynamic distributed databases; and c) a biologically inspired intrusion detection system.
  5. 5G services and applications explicitly reserve compute and network resources in today's complex and dynamic infrastructure of multi-tiered computing and cellular networking to ensure application-specific service quality metrics, and the infrastructure providers charge the 5G services for the reserved resources. A static, one-time reservation of resources at service deployment typically results in extended periods of under-utilization of the reserved resources during the lifetime of the service. This is due to a plethora of reasons, such as changes in the content from IoT sensors (for example, a change in the number of people in the field of view of a camera) or changes in the environmental conditions around the IoT sensors (for example, time of day, rain, or fog can affect data acquisition). Under-utilization of one resource, such as compute, can also be caused by temporarily inadequate availability of another resource, such as network bandwidth, in a dynamic 5G infrastructure. We propose a novel Reinforcement Learning based online method to dynamically adjust an application's compute and network resource reservations to minimize under-utilization of requested resources while ensuring acceptable service quality metrics (a toy sketch of such a loop follows below). We observe that a complex application-specific coupling exists between the compute and network usage of an application. Our proposed method learns this coupling during the operation of the service and dynamically modulates the compute and network resource requests to minimize under-utilization of reserved resources. Through experimental evaluation using a real-world video analytics application, we show that our technique captures the complex compute-network coupling relationship in an online manner, i.e., while the application is running, and dynamically adapts to save up to 65% of compute and 93% of network resources on average (over multiple runs), without significantly impacting application accuracy.
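To make the online loop concrete, here is a toy epsilon-greedy sketch, not the paper's method: it searches joint (compute, network) reservation levels, rewarding high utilization while penalizing saturation; measure_usage() and the reward shaping are hypothetical stand-ins for real service telemetry:

```python
# Toy online adjustment of joint (compute, network) reservations.
import random
from collections import defaultdict

levels = [(c, n) for c in (1, 2, 4) for n in (10, 50, 100)]  # vCPUs, Mbps
Q = defaultdict(float)                     # value of each reservation pair
alpha, eps = 0.1, 0.3

def measure_usage(cpu, mbps):
    """Hypothetical probe: fraction of each reservation actually used."""
    return min(1.0, 2.0 / cpu), min(1.0, 40.0 / mbps)

for step in range(1000):
    if random.random() < eps:
        a = random.choice(levels)            # explore a reservation pair
    else:
        a = max(levels, key=lambda l: Q[l])  # exploit the best known pair
    used_c, used_n = measure_usage(*a)
    # Reward high utilization, but penalize saturating either reservation,
    # since saturation risks violating the service quality target.
    reward = used_c + used_n - (2.0 if used_c >= 1.0 or used_n >= 1.0 else 0.0)
    Q[a] += alpha * (reward - Q[a])          # bandit-style running estimate
```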