Title: Annulus: A Dual Congestion Control Loop for Datacenter and WAN Traffic Aggregates
Cloud services are deployed in datacenters connected through high-bandwidth Wide Area Networks (WANs). We find that WAN traffic negatively impacts the performance of datacenter traffic, increasing tail latency by 2.5x, despite its small bandwidth demand. This behavior is caused by the long round-trip time (RTT) for WAN traffic, combined with limited buffering in datacenter switches. The long WAN RTT forces datacenter traffic to take the full burden of reacting to congestion. Furthermore, datacenter traffic changes on a faster timescale than the WAN RTT, making it difficult for WAN congestion control to estimate available bandwidth accurately. We present Annulus, a congestion control scheme that relies on two control loops to address these challenges. One control loop leverages existing congestion control algorithms for bottlenecks where there is only one type of traffic (i.e., WAN or datacenter). The other loop handles bottlenecks shared between WAN and datacenter traffic near the traffic source, using direct feedback from the bottleneck. We implement Annulus on a testbed and in simulation. Compared to baselines using BBR for WAN congestion control and DCTCP or DCQCN for datacenter congestion control, Annulus increases bottleneck utilization by 10% and lowers datacenter flow completion time by 1.3-3.5x.
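The two-loop idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two loops are combined or how the direct feedback is encoded, so everything below is an assumption made for illustration: the class name `DualLoopSender`, the `severity` feedback value, the multiplicative-decrease and additive-increase constants, and the choice to send at the minimum of the two rates are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class DualLoopSender:
    """Hypothetical sender that tracks one rate per control loop (rates in Gbps)."""

    e2e_rate: float              # set by an existing end-to-end algorithm (e.g. BBR or DCTCP)
    near_source_rate: float      # set by the near-source loop from direct switch feedback
    line_rate: float = 100.0     # assumed NIC line rate
    decrease_gain: float = 0.5   # assumed scale of the multiplicative decrease
    increase_step: float = 1.0   # assumed additive increase per quiet interval

    def on_direct_feedback(self, severity: float) -> None:
        """A near-source bottleneck reports congestion; severity is clamped to [0, 1]."""
        severity = min(max(severity, 0.0), 1.0)
        self.near_source_rate *= 1.0 - self.decrease_gain * severity

    def on_quiet_interval(self) -> None:
        """No direct feedback arrived during an interval: probe upward, capped at line rate."""
        self.near_source_rate = min(self.line_rate, self.near_source_rate + self.increase_step)

    def sending_rate(self) -> float:
        """Assumption: the sender obeys whichever loop is currently more restrictive."""
        return min(self.e2e_rate, self.near_source_rate, self.line_rate)
```

In this sketch the near-source loop can react within a datacenter-scale RTT even when the flow is a WAN flow whose end-to-end loop updates only once per long WAN RTT, which is the asymmetry the abstract describes.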
Award ID(s):
1816331
PAR ID:
10166612
Journal Name:
Proceedings of ACM SIGCOMM
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The alpha version of Bottleneck Bandwidth and Round-trip Time version 2 (BBRv2) has been recently presented, which aims to mitigate the shortcomings of its predecessor, BBR version 1 (BBRv1). Previous studies show that BBRv1 provides a high link utilization and low queuing delay by estimating the available bottleneck bandwidth. However, its aggressiveness induces unfairness when flows i) use different congestion control algorithms, such as CUBIC, and ii) have distinct round-trip times (RTTs). This paper presents an experimental evaluation of BBRv2, using Mininet. Results show that the coexistence between BBRv2-CUBIC is enhanced with respect to that of BBRv1-CUBIC, as measured by the fairness index. They also show that BBRv2 mitigates the RTT unfairness problem observed in BBRv1. Additionally, BBRv2 achieves a better fair share of the bandwidth than its predecessor when network conditions such as bandwidth and latency dynamically change. Results also indicate that the average flow completion time of concurrent flows is reduced when BBRv2 is used.
  2. BBR is a new congestion control algorithm proposed by Google that builds a model of the network path consisting of its bottleneck bandwidth and RTT to govern its sending rate rather than packet loss (like CUBIC and many other popular congestion control algorithms). Loss-based congestion control has been shown to be vulnerable to acknowledgment manipulation attacks. However, no prior work has investigated how to design such attacks for BBR, nor how effective they are in practice. In this paper we systematically analyze the vulnerability of BBR to acknowledgment manipulation attacks. We create the first detailed BBR finite state machine and a novel algorithm for inferring its current BBR state at runtime by passively observing network traffic. We then adapt and apply a TCP fuzzer to the Linux TCP BBR v1.0 implementation. Our approach generated 30,297 attack strategies, of which 8,859 misled BBR about actual network conditions. From these, we identify 5 classes of attacks causing BBR to send faster, slower or stall. We also found that BBR is immune to acknowledgment burst, division and duplication attacks that were previously shown to be effective against loss-based congestion control such as TCP New Reno.
  3. This work investigates traffic control via controlled connected and automated vehicles (CAVs) using novel controllers derived from the linear-quadratic regulator (LQR) theory. CAV-platoons are modeled as moving bottlenecks impacting the surrounding traffic with their speeds as control inputs. An iterative controller algorithm based on the LQR theory is proposed along with a variant that allows for penalizing abrupt changes in platoon speeds. The controllers use the Lighthill-Whitham-Richards (LWR) model implemented using an extended cell transmission model (CTM) which considers the capacity drop phenomenon for a realistic representation of traffic in congestion. The impact of various parameters of the proposed controller on the control performance is analyzed. The effectiveness of the proposed traffic control algorithms is tested using a traffic control example and compared with existing proportional-integral (PI) and model predictive control (MPC) controllers from the literature. A case study using the TransModeler traffic microsimulation software is conducted to test the usability of the proposed controller as well as existing controllers in a realistic setting and derive qualitative insights. It is observed that the proposed controller works well in both settings to mitigate the impact of the jam caused by a fixed bottleneck. The computation time required by the controller is also small, making it suitable for real-time control.
  4. Circuit-switched technologies have long been proposed for handling high-throughput traffic in datacenter networks, but recent developments in nanosecond-scale reconfiguration have created the enticing possibility of handling low-latency traffic as well. The novel Oblivious Reconfigurable Network (ORN) design paradigm promises to deliver on this possibility. Prior work in ORN designs achieved latencies that scale linearly with system size, making them unsuitable for large-scale deployments. Recent theoretical work showed that ORNs can achieve far better latency scaling, proposing theoretical ORN designs that are Pareto optimal in latency and throughput. In this work, we bridge multiple gaps between theory and practice to develop Shale, the first ORN capable of providing low-latency networking at datacenter scale while still guaranteeing high throughput. By interleaving multiple Pareto optimal schedules in parallel, both latency- and throughput-sensitive flows can achieve optimal performance. To achieve the theoretical low latencies in practice, we design a new congestion control mechanism which is best suited to the characteristics of Shale. In datacenter-scale packet simulations, our design compares favorably with an in-network congestion mitigation strategy, modern receiver-driven protocols such as NDP, and an idealized analog for sender-driven protocols. We implement an FPGA-based prototype of Shale, achieving orders of magnitude better resource scaling than existing ORN proposals. Finally, we extend our congestion control solution to handle node and link failures.
  5. Recent years have seen a slew of papers on datacenter congestion control mechanisms. In this editorial, we ask whether the bulk of this research is needed for the common case where congestion control involves hosts responding to simple congestion signals from the network and the performance goal is reducing some average measure of Flow Completion Time. We raise this question because we find that, out of all the possible variations one could make in congestion control algorithms, the most essential feature is the switch scheduling algorithm. More specifically, we find that congestion control mechanisms that use Shortest-Remaining-Processing-Time (SRPT) achieve superior performance as long as the rate-setting algorithm at the host is reasonable. We further find that while SRPT’s performance is quite robust to host behaviors, the performance of schemes that use scheduling algorithms like FIFO or Fair Queuing depends far more crucially on the rate-setting algorithm, and their performance is typically worse than what can be achieved with SRPT. Given these findings, we then ask whether it is practical to realize SRPT in switches without requiring custom hardware. We observe that approximate and deployable SRPT (ADS) designs exist, which leverage the small number of priority queues supported in almost all commodity switches, and require only software changes in the host and the switches. Our evaluations with one very simple ADS design show that it can achieve performance close to true SRPT and is significantly better than FIFO. Thus, the answer to our basic question – whether the bulk of recent research on datacenter congestion control algorithms is needed for the common case – is no.
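The claim in the last entry above, that SRPT scheduling lowers average flow completion time relative to FIFO, can be illustrated with a toy model. This is a sketch under stated assumptions, not a reproduction of any evaluation in the listed papers: it models a single link that serves one size unit per time unit, ignores rate-setting at the hosts entirely, and uses a hypothetical helper named `mean_fct` with a made-up three-flow workload.

```python
import heapq


def mean_fct(flows, policy):
    """Mean flow completion time on one link serving 1 size unit per time unit.

    flows:  list of (arrival_time, size) pairs (illustrative workload).
    policy: "fifo" serves the earliest arrival first; "srpt" always serves
            the flow with the least remaining size, preempting if needed.
    """
    pending = sorted(flows)      # flows ordered by arrival time
    queue = []                   # heap of (priority_key, arrival, remaining)
    now, i, total, n = 0.0, 0, 0.0, len(pending)
    while i < n or queue:
        if not queue:            # link is idle: jump ahead to the next arrival
            now = max(now, pending[i][0])
        while i < n and pending[i][0] <= now:
            arrival, size = pending[i]
            key = size if policy == "srpt" else arrival
            heapq.heappush(queue, (key, arrival, size))
            i += 1
        _, arrival, remaining = heapq.heappop(queue)
        # Serve the chosen flow until it finishes or the next flow arrives.
        next_arrival = pending[i][0] if i < n else float("inf")
        run = min(remaining, next_arrival - now)
        now += run
        remaining -= run
        if remaining > 0:        # not finished: re-queue with an updated key
            key = remaining if policy == "srpt" else arrival
            heapq.heappush(queue, (key, arrival, remaining))
        else:
            total += now - arrival
    return total / n


# One long flow followed by two short ones.
workload = [(0, 10), (1, 1), (2, 1)]
print(mean_fct(workload, "fifo"), mean_fct(workload, "srpt"))
```

Under FIFO both short flows wait behind the long one, so every flow sees a completion time of 10. Under SRPT the short flows preempt the long flow and each finishes in 1 time unit, while the long flow finishes at time 12, giving a mean of 14/3.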