skip to main content


Title: Coordinating rolling software upgrades for cellular networks
Cellular service providers continuously upgrade their network software on base stations to introduce new service features, fix software bugs, enhance quality of experience to users, or patch security vulnerabilities. A software upgrade typically requires the network element to be taken out of service, which can potentially degrade the service to users. Thus, the new software is deployed across the network using a rolling upgrade model such that the service impact during the roll-out is minimized. A sequential roll-out guarantees minimal impact but increases the deployment time thereby incurring a significant human cost and time in monitoring the upgrade. A network-wide concurrent roll-out guarantees minimal deployment time but can result in a significant service impact. The goal is to strike a balance between deployment time and service impact during the upgrade. In this paper, we first present our findings from analyzing upgrades in operational networks and discussions with network operators and exposing the challenges in rolling software upgrades. We propose a new framework Concord to effectively coordinate software upgrades across the network that balances the deployment time and service impact. We evaluate Concord using real-world data collected from a large operational cellular network and demonstrate the benefits and tradeoffs. We also present a prototype deployment of Concord using a small-scale LTE testbed deployed indoors in a corporate building.  more » « less
Award ID(s):
1718089
NSF-PAR ID:
10075828
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
International Conference on Network Protocols
Page Range / eLocation ID:
1 to 10
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Cellular networks are constantly evolving due to frequent changes in radio access and end user equipment technologies, dynamic applications and associated traffic mixes. Network upgrades should be performed with extreme caution since millions of users heavily depend on the cellular networks for a wide range of day to day tasks, including emergency and alert notifications. Before upgrading the entire network, it is important to conduct field evaluation of upgrades. Field evaluations are typically cumbersome and can be time consuming; however if done correctly they can help alleviate a lot of the deployment issues in terms of service quality degradation. The choice and number of field test locations have significant impacts on the time-to-market as well as confidence in how well various network upgrades will work out in the rest of the network. In this paper, we propose a novel approach – Reflection to automatically determine where to conduct the upgrade field tests in order to accurately identify important features that affect the upgrade. We demonstrate the effectiveness of Reflection using extensive evaluation based on real traces collected from a major US cellular network as well as synthetic traces. 
    more » « less
  2. null (Ed.)
    One of the most costly factors in providing a global computing infrastructure such as the WLCG is the human effort in deployment, integration, and operation of the distributed services supporting collaborative computing, data sharing and delivery, and analysis of extreme scale datasets. Furthermore, the time required to roll out global software updates, introduce new service components, or prototype novel systems requiring coordinated deployments across multiple facilities is often increased by communication latencies, staff availability, and in many cases expertise required for operations of bespoke services. While the WLCG (and distributed systems implemented throughout HEP) is a global service platform, it lacks the capability and flexibility of a modern platform-as-a-service including continuous integration/continuous delivery (CI/CD) methods, development-operations capabilities (DevOps, where developers assume a more direct role in the actual production infrastructure), and automation. Most importantly, tooling which reduces required training, bespoke service expertise, and the operational effort throughout the infrastructure, most notably at the resource endpoints (sites), is entirely absent in the current model. In this paper, we explore ideas and questions around potential NoOps models in this context: what is realistic given organizational policies and constraints? How should operational responsibility be organized across teams and facilities? What are the technical gaps? What are the social and cybersecurity challenges? Conversely what advantages does a NoOps model deliver for innovation and for accelerating the pace of delivery of new services needed for the HL-LHC era? We will describe initial work along these lines in the context of providing a data delivery network supporting IRIS-HEP DOMA R&D. 
    more » « less
  3. Recent Internet-of-Things (IoT) networks span across a multitude of stationary and robotic devices, namely unmanned ground vehicles, surface vessels, and aerial drones, to carry out mission-critical services such as search and rescue operations, wildfire monitoring, and flood/hurricane impact assessment. Achieving communication synchrony, reliability, and minimal communication jitter among these devices is a key challenge both at the simulation and system levels of implementation due to the underpinning differences between a physics-based robot operating system (ROS) simulator that is time-based and a network-based wireless simulator that is event-based, in addition to the complex dynamics of mobile and heterogeneous IoT devices deployed in a real environment. Nevertheless, synchronization between physics (robotics) and network simulators is one of the most difficult issues to address in simulating a heterogeneous multi-robot system before transitioning it into practice. The existing TCP/IP communication protocol-based synchronizing middleware mostly relied on Robot Operating System 1 (ROS1), which expends a significant portion of communication bandwidth and time due to its master-based architecture. To address these issues, we design a novel synchronizing middleware between robotics and traditional wireless network simulators, relying on the newly released real-time ROS2 architecture with a master-less packet discovery mechanism. Additionally, we propose a ground and aerial agents’ velocity-aware customized QoS policy for Data Distribution Service (DDS) to minimize the packet loss and transmission latency between a diverse set of robotic agents, and we offer the theoretical guarantee of our proposed QoS policy. We performed extensive network performance evaluations both at the simulation and system levels in terms of packet loss probability and average latency with line-of-sight (LOS) and non-line-of-sight (NLOS) and TCP/UDP communication protocols over our proposed ROS2-based synchronization middleware. Moreover, for a comparative study, we presented a detailed ablation study replacing NS-3 with a real-time wireless network simulator, EMANE, and masterless ROS2 with master-based ROS1. Our proposed middleware attests to the promise of building a largescale IoT infrastructure with a diverse set of stationary and robotic devices that achieve low-latency communications (12% and 11% reduction in simulation and reality, respectively) while satisfying the reliability (10% and 15% packet loss reduction in simulation and reality, respectively) and high-fidelity requirements of mission-critical applications. 
    more » « less
  4. Privacy technologies support the provision of online services while protecting user privacy. Cryptography lies at the heart of many such technologies, creating remarkable possibilities in terms of functionality while offering robust guarantees of data confidential- ity. The cryptography literature and discourse often represent that these technologies eliminate the need to trust service providers, i.e., they enable users to protect their privacy even against untrusted service providers. Despite their apparent promise, privacy technolo- gies have seen limited adoption in practice, and the most successful ones have been implemented by the very service providers these technologies purportedly protect users from. The adoption of privacy technologies by supposedly adversarial service providers highlights a mismatch between traditional models of trust in cryptography and the trust relationships that underlie deployed technologies in practice. Yet this mismatch, while well known to the cryptography and privacy communities, remains rela- tively poorly documented and examined in the academic literature— let alone broader media. This paper aims to fill that gap. Firstly, we review how the deployment of cryptographic tech- nologies relies on a chain of trust relationships embedded in the modern computing ecosystem, from the development of software to the provision of online services, that is not fully captured by tra- ditional models of trust in cryptography. Secondly, we turn to two case studies—web search and encrypted messaging—to illustrate how, rather than removing trust in service providers, cryptographic privacy technologies shift trust to a broader community of secu- rity and privacy experts and others, which in turn enables service providers to implicitly build and reinforce their trust relationship with users. Finally, concluding that the trust models inherent in the traditional cryptographic paradigm elide certain key trust relation- ships underlying deployed cryptographic systems, we highlight the need for organizational, policy, and legal safeguards to address that mismatch, and suggest some directions for future work. 
    more » « less
  5. null (Ed.)
    Edge and fog computing encompass a variety of technologies that are poised to enable new applications across the Internet that support data capture, storage, processing, and communication across the networking continuum. These environments pose new challenges to the design and implementation of networks-as membership can be dynamic and devices are heterogeneous, widely distributed geographically, and in proximity to end-users, as is the case with mobile and Internet-of-Things (IoT) devices. We present a demonstration of EdgeVPN.io (Evio for short), an open-source programmable, software-defined network that addresses challenges in the deployment of virtual networks spanning distributed edge and cloud resources, in particular highlighting its use in support of the Kubernetes container orchestration middleware. The demo highlights a deployment of unmodified Kubernetes middleware across a virtual cluster comprising virtual machines deployed both in cloud providers, and in distinct networks at the edge-where all nodes are assigned private IP addresses and subject to different NAT (Network Address Translation) middleboxes, connected through an Evio virtual network. The demo includes an overview of the configuration of Kubernetes and Evio nodes and the deployment of Docker-based container pods, highlighting the seamless connectivity for TCP/IP applications deployed on the pods. 
    more » « less