Title: Application-specific, Dynamic Reservation of 5G Compute and Network Resources by Using Reinforcement Learning
5G services and applications explicitly reserve compute and network resources in today's complex and dynamic infrastructure of multi-tiered computing and cellular networking to ensure application-specific service quality metrics, and the infrastructure providers charge the 5G services for the resources reserved. A static, one-time reservation of resources at service deployment typically results in extended periods of under-utilization of the reserved resources during the lifetime of the service. This is due to many reasons, such as changes in the content produced by IoT sensors (for example, a change in the number of people in a camera's field of view) or changes in the environmental conditions around the sensors (for example, time of day, rain, or fog can affect data acquisition). Under-utilization of one resource, such as compute, can also stem from temporarily inadequate availability of another resource, such as network bandwidth, in a dynamic 5G infrastructure. We propose a novel reinforcement learning-based online method to dynamically adjust an application's compute and network resource reservations to minimize under-utilization of requested resources while ensuring acceptable service quality metrics. We observe that a complex, application-specific coupling exists between the compute and network usage of an application. Our proposed method learns this coupling during the operation of the service and dynamically modulates the compute and network resource requests to minimize under-utilization of reserved resources. Through experimental evaluation using a real-world video analytics application, we show that our technique captures the complex compute-network coupling relationship in an online manner, i.e., while the application is running, and dynamically adapts to save up to 65% of compute and 93% of network resources on average (over multiple runs), without significantly impacting application accuracy.
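Below is a minimal, illustrative sketch of the kind of online reinforcement-learning loop the abstract describes: an agent periodically observes compute and network utilization, adjusts the corresponding reservation requests, and is rewarded for keeping reserved resources well utilized without violating service quality. This is not the paper's implementation; the tabular Q-learning agent, the discretized state, the action set, and the simulated workload are all simplifying assumptions made purely for illustration.

"""Illustrative sketch of an online RL loop that adjusts compute and network
reservations.  NOT the paper's implementation: the state discretization,
action set, reward, and simulated workload below are assumptions."""

import random
from collections import defaultdict

# Actions: adjust the compute or network reservation by a fractional step, or keep both.
ACTIONS = [(-0.1, 0.0), (0.0, -0.1), (0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]

def discretize(value, bins=5):
    """Map a utilization ratio in [0, 1] to a coarse bin index."""
    return min(int(value * bins), bins - 1)

class ReservationAgent:
    """Tabular epsilon-greedy Q-learning over (compute, network) utilization bins."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)      # (state, action_index) -> value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        if random.random() < self.epsilon:
            return random.randrange(len(ACTIONS))
        return max(range(len(ACTIONS)), key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward_value, next_state):
        best_next = max(self.q[(next_state, a)] for a in range(len(ACTIONS)))
        td_target = reward_value + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

def reward(compute_util, network_util, quality_ok):
    """Favor high utilization of reserved resources, penalize quality violations."""
    if not quality_ok:
        return -1.0
    return compute_util + network_util - 1.0   # in [-1, 1]

# --- toy control loop over a simulated workload (stand-in for real telemetry) ---
compute_res, network_res = 1.0, 1.0            # normalized reservations
agent = ReservationAgent()
state = (discretize(0.5), discretize(0.5))

for _ in range(1000):
    action = agent.act(state)
    d_compute, d_network = ACTIONS[action]
    compute_res = min(max(compute_res + d_compute, 0.1), 1.0)
    network_res = min(max(network_res + d_network, 0.1), 1.0)

    # Simulated demand; a real deployment would read these from monitoring.
    compute_demand = 0.3 + 0.2 * random.random()
    network_demand = 0.2 + 0.2 * random.random()
    compute_util = min(compute_demand / compute_res, 1.0)
    network_util = min(network_demand / network_res, 1.0)
    quality_ok = compute_res >= compute_demand and network_res >= network_demand

    next_state = (discretize(compute_util), discretize(network_util))
    agent.update(state, action, reward(compute_util, network_util, quality_ok), next_state)
    state = next_state

print(f"final reservations: compute={compute_res:.2f}, network={network_res:.2f}")

In a real deployment, the simulated demand would be replaced by telemetry from the 5G infrastructure and the quality check by the application's own service quality metrics; the paper's method additionally learns the compute-network coupling jointly rather than treating the two resources independently.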
Award ID(s):
2127605
PAR ID:
10381008
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings ACM SIGCOMM 2022 Workshop on Network-Application Integration (NAI'22)
Page Range / eLocation ID:
19-25
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Distributed networked systems form an essential resource for computation and applications across commercial, military, scientific, and research communities. Allocation of resources on a given infrastructure is realized through various mapping systems tailored to the specific use cases of the requesting applications. While HPC system requests demand compute resources heavy on processor and memory, cloud applications may demand distributed web services composed of networked processing and some memory. All resource requests are allocated on infrastructure with some form of network connectivity. However, during mapping of resources, the features and topology constraints of network components are typically handled indirectly through abstractions of user requests. This paper presents a novel graph representation that enables precise mapping methods for distributed networked systems. The proposed graph representations are demonstrated to allocate specific network components and meet the adjacency requirements of a requested graph on a given infrastructure. Furthermore, we report on the application of business policy requirements, which resulted in increased utilization and a gradual decrease in idle node count as requests were mapped using our proposed methods.
  2. Modern fifth-generation (5G) networks are increasingly moving towards architectures characterized by softwarization and virtualization. This paper addresses the complexities and challenges in deploying applications and services in the emerging multi-tiered 5G network architecture, particularly in the context of microservices-based applications. These applications, characterized by their structure as directed graphs of interdependent functions, are sensitive to the deployment tiers and resource allocation strategies, which can result in performance degradation and susceptibility to failures. Additionally, the threat of deploying potentially malicious applications exacerbates resource allocation inefficiencies. To address these issues, we propose a novel optimization framework that incorporates a probabilistic approach for assessing the risk of malicious applications, leading to a more resilient resource allocation strategy. Our framework dynamically optimizes both computational and networking resources across various tiers, aiming to enhance key performance metrics such as latency, accuracy, and resource utilization. Through detailed simulations, we demonstrate that our framework not only satisfies strict performance requirements but also surpasses existing methods in efficiency and security. 
  3. Cloud virtualization and multi-tenant networking provide Infrastructure as a Service (IaaS) providers a new and innovative way to offer on-demand services to their customers, such as easy provisioning of new applications and better resource efficiency and scalability. However, existing data-intensive intelligent applications require more powerful processors, higher bandwidth, and lower-latency networking services. In order to boost the performance of computing and networking services, as well as reduce the overhead of software virtualization, we propose a new data center network design based on OpenStack. Specifically, we map the OpenStack networking services to the hardware switch and utilize hardware-accelerated L2 switching and L3 routing to overcome the software limitations, while achieving software-like scalability and flexibility. We design our prototype system using an Arista Software-Defined Networking (SDN) switch and provide an automated script that abstracts the service layer, decoupling OpenStack from the physical network infrastructure and thereby providing vendor independence. We have evaluated the performance improvement in terms of bandwidth, delay, and system resource utilization using various tools and under various Quality-of-Service (QoS) constraints. Our solution demonstrates improved cloud scaling and network efficiency with only one touch point to control all vendors' devices in the data center.
  4. The Internet of Things (IoT) is becoming increasingly popular due to its ability to connect machines and enable an ecosystem for new applications and use cases. One such use case is industrial IoT (IIoT), which refers to the application of IoT in industrial settings, especially the instrumentation and control of sensors and machines with Cloud technologies. Industries are counting on the fifth generation (5G) of mobile communications to provide seamless, ubiquitous, and flexible connectivity among machines, people, and sensors. The open radio access network (O-RAN) architecture adds additional interfaces and RAN intelligent controllers that can be leveraged to meet the IIoT service requirements. In this paper, we examine the connectivity requirements for IIoT, which are dominated by two industrial applications: control and monitoring. We present a strength, weakness, opportunity, and threat (SWOT) analysis of O-RAN for IIoT and provide a use case example which illustrates how O-RAN can support diverse and changing IIoT network services. We conclude that the flexibility of the O-RAN architecture, which supports the latest cellular network standards and services, provides a path forward for next-generation IIoT network design, deployment, customization, and maintenance. It offers more control but still lacks products (hardware and software) that are exhaustively tested in production-like environments.
  5. Compute heterogeneity is increasingly gaining prominence in modern datacenters due to the addition of accelerators like GPUs and FPGAs. We observe that datacenter schedulers are agnostic of these emerging accelerators, especially their resource utilization footprints, and thus are not well equipped to dynamically provision them based on application needs. We observe that state-of-the-art datacenter schedulers fail to provide fine-grained resource guarantees for latency-sensitive tasks that are GPU-bound. Specifically for GPUs, this results in resource fragmentation and interference, leading to poor utilization of allocated GPU resources. Furthermore, GPUs exhibit highly linear energy efficiency with respect to utilization, and hence proactive management of these resources is essential to keep operational costs low while ensuring end-to-end Quality of Service (QoS) for user-facing queries. Towards addressing the GPU orchestration problem, we build Knots, a GPU-aware resource orchestration layer, and integrate it with the Kubernetes container orchestrator to build Kube-Knots. Kube-Knots can dynamically harvest spare compute cycles through dynamic container orchestration, enabling co-location of latency-critical and batch workloads while improving overall resource utilization. We design and evaluate two GPU-based scheduling techniques to schedule datacenter-scale workloads through Kube-Knots on a ten-node GPU cluster. Our proposed Correlation Based Prediction (CBP) and Peak Prediction (PP) schemes together improve both average and 99th-percentile cluster-wide GPU utilization by up to 80% for HPC workloads. In addition, CBP+PP improves the average job completion time (JCT) of deep learning workloads by up to 36% compared to state-of-the-art schedulers. This leads to 33% cluster-wide energy savings on average for three different workloads compared to state-of-the-art GPU-agnostic schedulers. Further, the proposed PP scheduler guarantees end-to-end QoS for latency-critical queries by reducing QoS violations by up to 53% compared to state-of-the-art GPU schedulers.
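As an illustration of the correlation-based prediction idea summarized in item 5 above, the sketch below fits a simple linear relationship between two GPU utilization metrics and uses the predicted value at a projected compute peak to decide whether a batch job can be safely co-located with a latency-critical workload. This is not the Kube-Knots code; the metric names, the linear model, and the headroom threshold are assumptions made purely for illustration.

"""Hedged sketch of correlation-based prediction for GPU co-location.
NOT the Kube-Knots implementation: metric names, the linear fit, and the
headroom threshold are illustrative assumptions."""

import statistics

def predict_correlated(history_a, history_b, current_a):
    """Extrapolate metric B at a given (e.g., projected peak) value of metric A
    via a simple least-squares fit, exploiting the correlation between the two
    utilization series."""
    mean_a, mean_b = statistics.fmean(history_a), statistics.fmean(history_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(history_a, history_b))
    var = sum((a - mean_a) ** 2 for a in history_a) or 1e-9
    slope = cov / var
    return mean_b + slope * (current_a - mean_a)

def can_colocate(gpu_compute_history, gpu_memory_history, projected_compute,
                 batch_job_memory, capacity=1.0, headroom=0.1):
    """Admit a batch job on a GPU only if the predicted memory use of the
    latency-critical tenant plus the batch job's demand leaves headroom."""
    predicted_memory = predict_correlated(gpu_compute_history, gpu_memory_history,
                                          projected_compute)
    return predicted_memory + batch_job_memory + headroom <= capacity

# Example: utilization samples in [0, 1] gathered from GPU telemetry.
compute_hist = [0.30, 0.45, 0.50, 0.62, 0.70]
memory_hist = [0.20, 0.28, 0.33, 0.41, 0.46]
print(can_colocate(compute_hist, memory_hist, projected_compute=0.80,
                   batch_job_memory=0.30))

In practice, such a predictor would run inside the cluster scheduler and be refreshed continuously from GPU telemetry, with the admission decision applied per node before placing batch containers alongside latency-critical ones.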