skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Leveraging SONiC Functionalities in Disaggregated Network Switches
Ever since the inception of the networking industry, routing and switching devices have been limited to tightly-coupled hardware and software components. Vendors provide closed source proprietary stacks, restraining network operators from utilizing customized features, and hence hindering innovation. This aggregated model is costly, time consuming, and unscalable as changes in the devices require vendor's intervention. As a result, the industry started manufacturing white-box switches and developing Network Operating Systems (NOSs) that support multiple vendors and Application Specific Integrated Circuits (ASICs). This model is referred to as ”disaggregated” as the software and hardware are decoupled; essentially, vendors' switching silicons (e.g., Broadcom) are compatible with different NOS (e.g., SONiC). In this paper, we discuss the lessons learned while designing and implementing a testbed that consists of disaggregated network devices. We iterate over several open source Internet Protocol (IP) routing suites and NOSs that are vendor-agnostic. Additionally, we highlight a novel type of forwarding data planes that are programmable and explore their features. The testbed consists of two white-box switches provided by Edgecore that use programmable switching silicon (Tofino) manufactured by Barefoot Networks, an Intel Company. We installed SONiC NOS on top of the switches and tested static and BGP routing protocols. We report the configuration process and the prerequisites needed to deploy a working disaggregated environment. Finally, we discuss how open source NOSs and programmable switches can be extended to support campus networks, rather than being data center-oriented only.  more » « less
Award ID(s):
1925484
PAR ID:
10252946
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
2020 43rd International Conference on Telecommunications and Signal Processing (TSP)
Page Range / eLocation ID:
457 to 460
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Network monitoring and measurement have always been critical components of network management. Recent developments in sketch-based monitoring techniques and the deployment opportunities arising from the increasing programmability of network elements (e.g., programmable switches, SmartNICs, and software switches) have made the possibility of accurate, detailed, network-wide telemetry tantalizingly within reach. However, the wide heterogeneity of the programmable hardware and dynamic changes in both resources available and resources needed for monitoring over time make existing approaches to network-wide monitoring impractical. We present HeteroSketch, a framework that consists of two main components: (1) a profiling tool that automatically quantifies the capabilities of arbitrary hardware by predicting their performance for sketching algorithms, and (2) an optimization framework that decides placement of measurement tasks and resource allocation for devices to meet monitoring goals while considering heterogeneous device capabilities. HeteroSketch enables optimized deployments for large networks (> 40,000 nodes) using a novel clustering approach and enables prompt responses to network topology, traffic, query, and resource dynamics. Our evaluation shows that HeteroSketch reduces resource overheads by 20-60% compared to prior art, while maintaining monitoring performance, coverage, and accuracy. 
    more » « less
  2. Network monitoring and measurement have always been critical components of network management. Recent developments in sketch-based monitoring techniques and the deployment opportunities arising from the increasing programmability of network elements (e.g., programmable switches, SmartNICs, and software switches) have made the possibility of accurate, detailed, network-wide telemetry tantalizingly within reach. However, the wide heterogeneity of the programmable hardware and dynamic changes in both resources available and resources needed for monitoring over time make existing approaches to network-wide monitoring impractical. We present HeteroSketch, a framework that consists of two main components: (1) a profiling tool that automatically quantifies the capabilities of arbitrary hardware by predicting their performance for sketching algorithms, and (2) an optimization framework that decides placement of measurement tasks and resource allocation for devices to meet monitoring goals while considering heterogeneous device capabilities. HeteroSketch enables optimized deployments for large networks (> 40,000 nodes) using a novel clustering approach and enables prompt responses to network topology, traffic, query, and resource dynamics. Our evaluation shows that HeteroSketch reduces resource overheads by 20-60% compared to prior art, while maintaining monitoring performance, coverage, and accuracy. 
    more » « less
  3. Network monitoring and measurement have always been critical components of network management. Recent developments in sketch-based monitoring techniques and the deployment opportunities arising from the increasing programmability of network elements (e.g., programmable switches, SmartNICs, and software switches) have made the possibility of accurate, detailed, network-wide telemetry tantalizingly within reach. However, the wide heterogeneity of the programmable hardware and dynamic changes in both resources available and resources needed for monitoring over time make existing approaches to network-wide monitoring impractical. We present HeteroSketch, a framework that consists of two main components: (1) a profiling tool that automatically quantifies the capabilities of arbitrary hardware by predicting their performance for sketching algorithms, and (2) an optimization framework that decides placement of measurement tasks and resource allocation for devices to meet monitoring goals while considering heterogeneous device capabilities. HeteroSketch enables optimized deployments for large networks (> 40,000 nodes) using a novel clustering approach and enables prompt responses to network topology, traffic, query, and resource dynamics. Our evaluation shows that HeteroSketch reduces resource overheads by 20−60% compared to prior art, while maintaining monitoring performance, coverage, and accuracy. 
    more » « less
  4. Bursts, sudden surges in network utilization, are a significant root cause of packet loss and high latency in datacenters. Packet deflection, re-routing packets that arrive at a local hotspot to neighboring switches, is shown to be a potent countermeasure against bursts. Unfortunately, existing deflection techniques cannot be implemented in today's datacenter switches. This is because, to minimize packet drops and remain effective under extreme load, existing deflection techniques rely on certain hardware primitives (e.g., extracting packets from arbitrary locations in the queue) that datacenter switches do not support. In this paper, we address the implementability hurdles of packet deflection. This paper proposes heuristics for approximating state-of-the-art deflection techniques in programmable switches. We introduce Simple Deflection which deflects excess traffic to randomly selected, non-congested ports and Preemptive Deflection (PD) in which switches identify the packets likely to be selected for deflection and preemptively deflect them before they are enqueued. We implement and evaluate our techniques on a testbed with Intel Tofino switches. Our testbed evaluations show that Simple and Preemptive Deflection improve the 99th percentile response times by 8× and 425×, respectively, compared to a baseline drop-tail queue under 90% load. Using large-scale network simulations, we show that the performance of our algorithms is close to the deflection techniques that they intend to approximate, e.g., PD achieves 4% lower 99th percentile query completion times (QCT) than Vertigo, a recent deflection technique that cannot be implemented in off-the-shelf switches, and 2.5× lower QCT than ECMP under 95% load. 
    more » « less
  5. P4 serves as a programming language for configuring flexible and programmable network data planes, facilitating the development of custom protocols and programmable switches, and driving innovation in software-defined networking and network function virtualization. While the Linux container based network emulator, Mininet, coupled with the BMv2 software P4 switch, is widely used for rapid prototyping of P4-based applications, BMv2’s diminished performance raises fidelity concerns under high traffic and large network scenarios. In this paper, we introduce a lightweight virtual time system integrated into Mininet with BMv2 to enhance fidelity and scalability. By applying a time dilation factor (TDF) to interactions between containers and the physical machine, we optimize the emulated P4 network’s perceived speed from the application processes’ perspective. System evaluation demonstrates accurate emulation of significantly larger networks under high loads with minimal system overhead. We showcase our system’s utility through two network applications: an emulation of a TCP SYN flood attack and an ECMP load balancer. Evaluating against a production-grade software switch, Open vSwitch, and a physical testbed, we highlight the virtual time system’s improvement in temporal fidelity despite the observed performance degradation in BMv2 software switches. 
    more » « less