Oblivious routing has a long history in both the theory and practice of networking. In this work we initiate the formal study of oblivious routing in the context of reconfigurable networks, a new architecture that has recently come to the fore in datacenter networking. These networks allow a rapidly changing boundeddegree pattern of interconnections between nodes, but the network topology and the selection of routing paths must both be oblivious to the traffic demand matrix. Our focus is on the tradeoff between maximizing throughput and minimizing latency in these networks. For every constant throughput rate, we characterize (up to a constant factor) the minimum latency achievable by an oblivious reconfigurable network design that satisfies the given throughput guarantee. The tradeoff between these two objectives turns out to be surprisingly subtle: the curve depicting it has an unexpected scalloped shape reflecting the fact that loadbalancing becomes more difficult when the average length of routing paths is not an integer because equalizing all the path lengths is not possible. The proof of our lower bound uses LP duality to verify that Valiant load balancing is the most efficient oblivious routing scheme when used in combination with an optimallydesigned reconfigurable network topology. The proof of our upper bound uses an algebraic construction in which the network nodes are identified with vectors over a finite field, the network topology is described by either the elementary basis or a sequence of Vandermonde matrices, and routing paths are constructed by selecting columns of these matrices to yield the appropriate mixture of path lengths within the shortest possible time interval.
more »
« less
This content will become publicly available on June 28, 2025
Breaking the VLB Barrier for Oblivious Reconfigurable Networks
In a landmark 1981 paper, Valiant and Brebner gave birth to the study of oblivious routing and, simultaneously, introduced its most powerful and ubiquitous method: Valiant load balancing (VLB). By routing messages through a randomly sampled intermediate node, VLB lengthens routing paths by a factor of two but gains the crucial property of obliviousness: it balances load in a completely decentralized manner, with no global knowledge of the communication pattern. Forty years later, with datacenters handling workloads whose communication pattern varies too rapidly to allow centralized coordination, oblivious routing is as relevant as ever, and VLB continues to take center stage as a widely used — and in some settings, provably optimal — way to balance load in the network obliviously to the traffic demands. However, the ability of the network to rapidly reconfigure its interconnection topology gives rise to new possibilities.
In this work we revisit the question of whether VLB remains optimal in the novel setting of reconfigurable networks. Prior work showed that VLB achieves the optimal tradeoff between latency and guaranteed throughput. In this work we show that a strictly superior latencythroughput tradeoff is achievable when the throughput bound is relaxed to hold with high probability. The same improved tradeoff is also achievable with guaranteed throughput under timestationary demands, provided the latency bound is relaxed to hold with high probability and that the network is allowed to be semioblivious, using an oblivious (randomized) connection schedule but demandaware routing. We prove that the latter result is not achievable by any fullyoblivious reconfigurable network design, marking a rare case in which semioblivious routing has a provable asymptotic advantage over oblivious routing. Our results are enabled by a novel oblivious routing scheme that improves VLB by stretching routing paths the minimum possible amount — an additive stretch of 1 rather than a multiplicative stretch of 2 — yet still manages to balance load with high probability when either the traffic demand matrix or the network’s interconnection schedule are shuffled by a uniformly random permutation. To analyze our routing scheme we prove an exponential tail bound which may be of independent interest, concerning the distribution of values of a bilinear form on an orbit of a permutation group action.
more »
« less
 NSFPAR ID:
 10511250
 Publisher / Repository:
 Association of Computing Machinery (ACM)
 Date Published:
 Journal Name:
 Conference proceedings of the annual ACM Symposium on Theory of Computing
 ISSN:
 07349025
 ISBN:
 9798400703836
 Page Range / eLocation ID:
 18651876
 Subject(s) / Keyword(s):
 Oblivious routing reconfigurable networks tail inequalities valiant load balancing
 Format(s):
 Medium: X
 Location:
 Vancouver, Canada
 Sponsoring Org:
 National Science Foundation
More Like this


Valiant routing, the use of a random intermediate node to distribute network traffic, has been proposed for a number of recent HPC network topologies. It is also commonly used as a bulding block for adaptive routing algorithms, which use shortest path routes when possible, but revert to Valiant routing when necessary to avoid hot spots. We show that the version of Valiant routing proposed for the Slim fly topology can cause messages to follow loops, using an edge in both directions before returning to edges of the original shortest path. Removing these loops in the UGALL adaptive routing algorithm is shown to provide slight improvements in average latency and also allow the network to carry up to 12% more traffic before saturation.more » « less

Oblivious routing of network traffic uses predetermined paths that do not change with changing traffic patterns. It has the benefit of using a fixed network configuration while robustly handling a range of varying and unpredictable traffic. Theoretical advances have shown that the benefits of oblivious routing are achievable without compromising much capacity efficiency. For oblivious routing, we only assume knowledge of the ingress/egress capacities of the edge nodes through which traffic enters or leaves the network. All traffic patterns possible subject to the ingress/egress capacity constraints (also known as the hose constraints) are permissible and are to be handled using oblivious routing. We use the widely deployed segment routing method for route control. Furthermore, for ease of deployment and to not deviate too much from conventional shortest path routing, we restrict paths to be 2segment paths (the composition of two shortest path routed segments). We solve the 2segment oblivious routing problem for all permissible traffic matrices (which can be infinitelymany).We develop a new adversarial and machinelearning driven approach that uses an iterative gradient descent method to solve the routing problem with worstcase performance guarantees. Additionally, the parallelism involved in descent methods allows this method to scale well with the network size making it amenable for use in practice.more » « less

Circuitswitched technologies have long been proposed for handling highthroughput traffic in datacenter networks, but recent developments in nanosecondscale reconfiguration have created the enticing possibility of handling lowlatency traffic as well. The novel Oblivious Reconfigurable Network (ORN) design paradigm promises to deliver on this possibility. Prior work in ORN designs achieved latencies that scale linearly with system size, making them unsuitable for largescale deployments. Recent theoretical work showed that ORNs can achieve far better latency scaling, proposing theoretical ORN designs that are Pareto optimal in latency and throughput. In this work, we bridge multiple gaps between theory and practice to develop Shale, the first ORN capable of providing lowlatency networking at datacenter scale while still guaranteeing high throughput. By interleaving multiple Pareto optimal schedules in parallel, both latency and throughputsensitive flows can achieve optimal performance. To achieve the theoretical low latencies in practice, we design a new congestion control mechanism which is best suited to the characteristics of Shale. In datacenterscale packet simulations, our design compares favorably with both an innetwork congestion mitigation strategy, modern receiverdriven protocols such as NDP, and an idealized analog for senderdriven protocols. We implement an FPGAbased prototype of Shale, achieving orders of magnitude better resource scaling than existing ORN proposals. Finally, we extend our congestion control solution to handle node and link failures.more » « less

The Dragonfly networks have been adopted in the current supercomputers, and will be deployed in future generation supercomputers and data centers. Effective routing on Dragonfly is challenging. Universal Globally Adaptive Loadbalanced routing (UGAL) is the stateoftheart routing algorithm for Dragonfly. For each packet, UGAL selects either a minimal path or a nonminimal path based on their estimated latencies. Practical UGAL makes routing decisions with local information, deriving the estimated latency for each path from the local queue occupancy and path hop count information. In this work, we develop techniques to improve the accuracy of the latency estimation for UGAL with local information, which results in more effective routing decisions. In particular, our schemes are able to proactively mitigate the potential network congestion with imbalanced network traffic. Extensive simulation experiments using synthetic traffic patterns and application workloads demonstrate that our enhanced UGAL schemes significantly improve the routing performance for many common traffic conditions.more » « less