skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Learning an Adaptive Forwarding Strategy for Mobile Wireless Networks: Resource Usage vs. Latency
Mobile wireless networks present several challenges for any learning system, due to uncertain and variable device movement, a decentralized network architecture, and constraints on network resources. In this work, we use deep reinforcement learning (DRL) to learn a scalable and generalizable forwarding strategy for such networks. We make the following contributions: i) we use hierarchical RL to design DRL packet agents rather than device agents, to capture the packet forwarding decisions that are made over time and improve training efficiency; ii) we use relational features to ensure generalizability of the learned forwarding strategy to a wide range of network dynamics and enable offline training; and iii) we incorporate both forwarding goals and network resource considerations into packet decision-making by designing a weighted DRL reward function. Our results show that our DRL agent often achieves a similar delay per packet delivered as the optimal forwarding strategy and outperforms all other strategies including state-of-the-art strategies, even on scenarios on which the DRL agent was not trained.  more » « less
Award ID(s):
2154190
PAR ID:
10407060
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Reinforcement Learning for Real Life (RL4RealLife) Workshop in the 36th Conference on Neural Information Processing Systems (NeurIPS)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Narodytska, Nina; Ruemmer, Philipp (Ed.)
    Deep reinforcement learning (DRL) is a powerful machine learning paradigm for generating agents that control autonomous systems. However, the “black box” nature of DRL agents limits their deployment in real-world safety-critical applications. A promising approach for providing strong guarantees on an agent's behavior is to use Neural Lyapunov Barrier (NLB) certifcates, which are learned functions over the system whose properties indirectly imply that an agent behaves as desired. However, NLB-based certifcates are typically diffcult to learn and even more diffcult to verify, especially for complex systems. In this work, we present a novel method for training and verifying NLB-based certifcates for discrete-time systems. Specifcally, we introduce a technique for certifcate composition, which simplifes the verifcation of highly-complex systems by strategically designing a sequence of certifcates. When jointly verifed with neural network verifcation engines, these certifcates provide a formal guarantee that a DRL agent both achieves its goals and avoids unsafe behavior. Furthermore, we introduce a technique for certifcate fltering, which signifcantly simplifes the process of producing formally verifed certifcates. We demonstrate the merits of our approach with a case study on providing safety and liveness guarantees for a DRL-controlled spacecraft. 
    more » « less
  2. Jara, C; Borras_Sol, J (Ed.)
    Deep Reinforcement Learning (DRL) has shown its capability to solve the high degrees of freedom in control and the complex interaction with the object in the multi-finger dexterous in-hand manipulation tasks. Current DRL approaches lack behavior constraints during the learning process, leading to aggressive and unstable policies that are insufficient for safety-critical in-hand manipulation tasks. The centralized learning strategy also limits the flexibility to fine-tune each robot finger's behavior. This work proposes the Finger-specific Multi-agent Shadow Critic Consensus (FMSC) method, which models the in-hand manipulation as a multi-agent collaboration task where each finger is an individual agent and trains the policies for the fingers to achieve a consensus across the critic networks through the Information Sharing (IS) across the neighboring agents and finger-specific stable manipulation objectives based on the state-action occupancy measure, a general utility of DRL that is approximated during the learning process. The methods are evaluated in two in-hand manipulation tasks on the Shadow Hand. The results show that FMSC+IS converges faster in training, achieving a comparable success rate and much better manipulation stability than conventional DRL methods. 
    more » « less
  3. The Open Radio Access Network (RAN) paradigm is transforming cellular networks into a system of disaggregated, virtualized, and software-based components. These self-optimize the network through programmable, closed-loop control, leveraging Artificial Intelligence (AI) and Machine Learning (ML) routines. In this context, Deep Reinforcement Learning (DRL) has shown great potential in addressing complex resource allocation problems. However, DRL-based solutions are inherently hard to explain, which hinders their deployment and use in practice. In this paper, we propose EXPLORA, a framework that provides explainability of DRL-based control solutions for the Open RAN ecosystem. EXPLORA synthesizes network-oriented explanations based on an attributed graph that produces a link between the actions taken by a DRL agent (i.e., the nodes of the graph) and the input state space (i.e., the attributes of each node). This novel approach allows EXPLORA to explain models by providing information on the wireless context in which the DRL agent operates. EXPLORA is also designed to be lightweight for real-time operation. We prototype EXPLORA and test it experimentally on an O-RAN-compliant near-real-time RIC deployed on the Colosseum wireless network emulator. We evaluate EXPLORA for agents trained for different purposes and showcase how it generates clear network-oriented explanations. We also show how explanations can be used to perform informative and targeted intent-based action steering and achieve median transmission bitrate improvements of 4% and tail improvements of 10%. 
    more » « less
  4. null (Ed.)
    Machine learning applied to architecture design presents a promising opportunity with broad applications. Recent deep reinforcement learning (DRL) techniques, in particular, enable efficient exploration in vast design spaces where conventional design strategies may be inadequate. This paper proposes a novel deep reinforcement framework, taking routerless networks-on-chip (NoC) as an evaluation case study. The new framework successfully resolves problems with prior design approaches, which are either unreliable due to random searches or inflexible due to severe design space restrictions. The framework learns (near-)optimal loop placement for routerless NoCs with various design constraints. A deep neural network is developed using parallel threads that efficiently explore the immense routerless NoC design space with a Monte Carlo search tree. Experimental results show that, compared with conventional mesh, the proposed deep reinforcement learning (DRL) routerless design achieves a 3.25x increase in throughput, 1.6x reduction in packet latency, and 5x reduction in power. Compared with the state-of-the-art routerless NoC, DRL achieves a 1.47x increase in throughput, 1.18x reduction in packet latency, 1.14x reduction in average hop count, and 6.3% lower power consumption. 
    more » « less
  5. With the commercialization and deployment of 5G, efforts are beginning to explore the design of the next generation of cellular networks, called 6G. New and constantly evolving use cases continue to place performance demands, especially for low latency communications, as these are still challenges for the 3GPP-specified 5G design, and will have to be met by the 6G design. Therefore, it is helpful to re-examine several aspects of the current cellular network’s design and implementation.Based on our understanding of the 5G cellular network specifications, we explore different implementation options for a dis-aggregated 5G core and their performance implications. To improve the data plane performance, we consider advanced packet classification mechanisms to support fast packet processing in the User Plane Function (UPF), to improve the poor performance and scalability of the current design based on linked lists. Importantly, we implement the UPF function on a SmartNIC for forwarding and tunneling. The SmartNIC provides the fastpath for device traffic, while more complex functions of buffering and processing flows that suffer a miss on the SmartNIC P4 tables are processed by the host-based UPF. Compared to an efficient DPDK-based host UPF, the SmartNIC UPF increases the throughput for 64 Byte packets by almost 2×. Furthermore, we lower the packet forwarding latency by 3.75× by using the SmartNIC. In addition, we propose a novel context-level QoS mechanism that dynamically updates the Packet Detection Rule priority and resource allocation of a flow based on the user context. By combining our innovations, we can achieve low latency and high throughput that will help us evolve to the next generation 6G cellular networks. 
    more » « less